GENE SIGNATURE

A 3'-directed cDNA library which accurately reflects the abundance ratio of mRNA in a cell has been
prepared from various human tissues, and the cDNAs contained in the library have been sequenced
to examine the incidence of each cDNA in each tissue. As each cDNA carries expression information for each
tissue corresponding to the mRNA concentration, these cDNAs are usable as probes or primers for detecting cell
anomalies or discriminating cells. The cloned genes can produce proteins utilizable as medicines or the like.
The present invention relates to purified single-stranded DNA molecules, purified single-stranded DNA
molecules complementary thereto, or purified double-stranded DNA molecules consisting of said single-
stranded DNA molecules, which can specifically hybridize to human genomic DNA, human cDNA or human
mRNA at particular sites. The DNA molecules of the present invention can be used for detecting the overall
or individual expression status of mRNAs coding for the corresponding cellular proteins, detecting and
diagnosing cellular abnormalities due to disease and viral infection, or distinguishing and identifying the cell
types, and cloning genes expressed in a tissue-specific manner. The present invention further
includes cloned DNA molecules which can be used to produce proteins useful as pharmaceutical products.
Recognizing the importance of the most fundamental attribute of mRNA, that is, "the nature of the cell
is determined by the expression pattern of genes as reflected in the population of mRNA", the inventors of
the present invention have proposed "body mapping" as a unique approach to their objective. This is an
attempt to collect information on gene expression in the presumably about 200 different
kinds of cells and tissues present in the human body, to elucidate when, where and to what extent
genes are expressed, and to map genes to the respective organ or cell type in which they are expressed.

While a variety of cells in the living body express various proteins depending on their respective
biological functions, the intracellular concentrations of these proteins vary according to the stage
of development and differentiation, environment, etc.

Genes can be classified into "genes encoding proteins essential for the life of the cell" and
"genes encoding proteins responsible for functions specific to the cell". Of these two, "genes encoding
proteins essential for the life of the cell" are expressed constantly in all types of cells and are called
"housekeeping genes", while "genes encoding proteins responsible for functions specific to the cell" are
often expressed specifically in a particular type of cell, or a particular group of cells, and also may be
specifically expressed at a particular stage of cellular development and differentiation. Furthermore, they are
[...] expression [...].

In other words, cells grow as a result of the expression of "genes encoding
proteins essential for the life of the cell" and display their specific functions as a result of the expression of
"genes encoding proteins responsible for functions specific to the cell".

In addition, under abnormal cellular conditions due to disease or infection, the expression of genes within
individual cells is altered as compared with that under the normal conditions. Especially during viral
infection, RNAs encoding virus-specific proteins are synthesized in large amounts within the host cell, leading to
the production of said proteins in large amounts. In other words, the alteration in the expression level of
genes within the cell, especially as reflected in the concentration of intracellular mRNA, can lead to such
abnormal cellular conditions as seen in diseases.

Thus, the function of each cell in the living body is closely related to the expression status of genes
in that cell. Therefore, in order to elucidate the function of each cell at the molecular level or to elucidate
the pathogenesis of a disease at the molecular level, it becomes necessary to comprehend the expression
status of cellular genes, especially the intracellular concentration of each mRNA.

One possible approach to this objective is the extraction and analysis of all cellular proteins
to determine their expression status. However, although it might be possible to isolate a specific protein, in
most cases it is almost impossible to completely isolate all of these proteins, because a great variety of
proteins are expressed within the cell.

Another approach is to directly estimate the concentrations of cellular mRNAs corresponding to all
proteins. However, although it may be possible to isolate a specific mRNA, it is almost impossible
to completely isolate all of these mRNAs and directly estimate their amounts, because a great
variety of mRNAs are synthesized simultaneously within the cell and furthermore they may be unstable and
susceptible to enzymatic degradation during their extraction.

This invention aims to provide DNA molecules which can be used as probes or primers required for
detecting the overall or individual expression status of mRNAs coding for the corresponding cellular
proteins, detecting or diagnosing cellular abnormalities due to disease or virus infection, recognizing and
identifying various cell types, and efficiently cloning genes expressed in a tissue-specific manner. Moreover,
this invention aims to provide cloned DNA molecules which can be used to produce proteins useful as
pharmaceutical products.
Summary of the invention

In general, the genetic information flows in order from DNA to mRNA and to protein (F. H. C. Crick,
1958). That is, the information for the amino acid sequence of a protein is first transcribed into mRNA and
then translated into protein. In more detail, mammalian genes commonly comprise a region encoding a protein and
a region regulating the expression of said gene. The regions of a gene encoding protein (called "exons")
are often separated by intervening sequences (called "introns"). When a gene is transcribed into RNA, the
introns of the precursor RNA (pre-mRNA) are excised and exons are connected in tandem to form a
continuous sequence encoding a particular protein (this process is called "splicing"). On the other hand, the
region regulating the expression of a gene comprises, in addition to the regions directly regulating transcription,
such as a promoter and operator which are present upstream of the transcription region, untranslated
regions located both upstream (5') and downstream (3') of the coding region. In particular, the 3'
untranslated region (3' UTR) is important for regulating expression, since it contributes to the transport and
stability of mRNA. During the processing of pre-mRNA, a methylated cap is added at its 5' end, the 3'
untranslated region is cleaved at a specific site, a poly(A) tail is attached by adding 100 - 200 adenylate
residues to the cleaved 3' end, and the coding regions are spliced together to form mRNA. The protein is then
synthesized by translation of the mRNA. It has been reported that, in general, when the intracellular concentration of a
particular mRNA is high, the expressed amount of the corresponding protein is also elevated, and also that
it is possible to estimate the relative concentration of each protein from the
intracellular concentration of the corresponding mRNA [DNA Sequence 2, 137-144 (1991); Nature Genetics [...]].

In the present invention, mRNA is extracted from a particular cell and cDNA is synthesized by
conventional methods using reverse transcriptase. However, in the present invention, cDNA is synthesized
by the method developed by the inventors of the present invention so as to reflect the relative intracellular
concentration of mRNA. A cDNA library is then constructed and a group of cDNAs representing the population of
mRNA is obtained.

An approach which appears similar to the one used by the inventors of the present invention, but
is entirely different, is the method of cloning from a cDNA library constructed by random priming, reported by
Venter et al.
Venter's group randomly cloned cDNAs from commercially available cDNA libraries derived from brain
cells (catalog No. 936206, 936205 or 935, Stratagene, California) and determined their base sequences
[Science 252, 1651-1656 (1991); Nature 355, 632-634 (1992)].
However, since the method used by Venter et al. involves sequencing of cDNAs obtained by random priming,
it has the following drawbacks:
1) Since random priming of different regions of a single mRNA may often lead to the formation
of cDNA fragments without any mutually overlapping portions, it is difficult to determine whether
these cDNA fragments are derived from the same mRNA or a different one.
2) The longer a mRNA strand, the higher the chance for said mRNA to be reverse-transcribed into
cDNA.
3) Since the availability of each primer to be used among random primers differs depending on its
base sequence, the relative frequency of cDNA synthesis is variable.

For the aforementioned reasons, the relative frequency of appearance of cDNA does not reflect the
relative concentration of cellular mRNA. Consequently, it is impossible to estimate the relative concentration of
each mRNA and the actual population of intracellular proteins by using the method of Venter et al.
However, with the method developed by the inventors of the present invention, it is possible to construct
a cDNA library which faithfully reflects the relative concentration of mRNA without any of the aforementioned
complications. In the present invention, cDNA is synthesized using only "poly-T" as the
primer [...]. Therefore, the synthesis of cDNA with "poly-T" as the primer [...], and cleavage at a specific
sequence results in the formation of cDNA extending from the 3'-terminus to the first MboI site.

In the present invention, each cDNA thus cloned and included in "a cDNA library faithfully
reflecting the relative intracellular concentration of mRNA" is called a "gene signature"
(hereinafter abbreviated as GS). The GS includes not only the double-stranded DNA but also each single-stranded
DNA thereof.
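
The length bias described in drawback 2) can be illustrated with a toy calculation. The following is only a sketch and not part of the described method: it assumes, purely for illustration, that random priming yields clones in proportion to (copy number x mRNA length), whereas poly-T priming yields one cDNA per mRNA molecule. The transcript names, copy numbers and lengths are hypothetical.

```python
def clone_shares(transcripts, length_weighted):
    """Return the expected share of clones for each transcript.

    transcripts: dict mapping name -> (copies, length_in_bases).
    length_weighted=True models random priming under the assumption that
    priming events scale with mRNA length; False models poly-T priming,
    where each mRNA molecule gives rise to one cDNA regardless of length.
    """
    weights = {
        name: copies * (length if length_weighted else 1)
        for name, (copies, length) in transcripts.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical example: two mRNAs present in equal copy number.
example = {"short_mRNA": (100, 1000), "long_mRNA": (100, 4000)}
random_primed = clone_shares(example, length_weighted=True)
poly_t_primed = clone_shares(example, length_weighted=False)
```

Under these assumptions the longer transcript is over-represented in the random-primed library although both transcripts are equally abundant, while the poly-T model returns equal shares.
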
The present invention relates to a purified single-stranded DNA, a purified single-stranded DNA
complementary thereto, or a purified double-stranded DNA consisting of said single strands, containing all or a
portion of a single-stranded DNA (or a single-stranded DNA complementary thereto) comprising any of the
base sequences listed under the sequence identification number (SEQ ID NO) 1 - 7837 and hybridizing
specifically to a particular site of human genomic DNA, human cDNA or human mRNA. The present
invention also relates to probes and primers consisting of said single-stranded DNA. The present invention
also relates to a purified single-stranded DNA, a purified single-stranded DNA complementary thereto, or a
purified double-stranded DNA consisting of said single strands, containing all or a portion of a
single-stranded DNA (or a single-stranded DNA complementary thereto) which is complementary to a human
mRNA containing any of the base sequences listed under SEQ ID NO 1 - 7837 (wherein T is read as U) or
any portion thereof at its 3' region and hybridizing specifically to a particular site of human genomic DNA,
human cDNA or human mRNA. The present invention also relates to probes and primers consisting of said
single-stranded DNA.

The present invention is explained further in detail as follows. 
The DNA of the present invention not only includes a single-stranded DNA (or a single-stranded DNA
complementary thereto) comprising any of the base sequences listed under SEQ ID NO 1 - 7837 but also
includes a single-stranded DNA containing a portion of said single-stranded DNA (or said single-stranded
DNA complementary thereto) if it hybridizes to human genomic DNA, human cDNA or human mRNA.

Furthermore, the DNA of the present invention not only includes a single-stranded DNA (or a
single-stranded DNA complementary thereto) which is complementary to a mRNA containing any of the base
sequences listed under SEQ ID NO 1 - 7837 (wherein T is read as U) or any portion thereof at its 3' region
but also includes a single-stranded DNA (or a single-stranded DNA complementary thereto) containing a
portion of said single-stranded DNA (or said single-stranded DNA complementary thereto) if it hybridizes to
human genomic DNA, human cDNA or human mRNA.

In addition, the DNA of the present invention not only includes a single-stranded DNA or a
single-stranded DNA complementary thereto but also includes a double-stranded DNA consisting of said single
strands.

Obviously, the term "contain" as used herein does not necessarily mean that the DNA of the present
invention contains at a single site without interruption (1) "a single-stranded DNA (or a single-stranded DNA
complementary thereto) comprising any of the base sequences listed under SEQ ID NO 1 - 7837 or a portion
thereof" or (2) "a single-stranded DNA (or a single-stranded DNA complementary thereto) which is
complementary to a mRNA containing any or any portion of the base sequences listed under SEQ ID NO 1
- 7837 (wherein T is read as U) at its 3' region or a portion of said single-stranded DNA." In other words,
the term "contain" is applicable also to the case where one or more exogenous bases are inserted in the
base sequence of the DNA (1) or (2).

The hybridization to a particular site of human genomic DNA, human cDNA or human mRNA can be
achieved under standard conditions (see e.g., Molecular Cloning: A Laboratory Manual, Sambrook, J. et al.,
Cold Spring Harbor Laboratory Press, 1989). In the following preferred embodiment there will be described
methods for constructing a cDNA library which reflects precisely the relative intracellular concentration of
mRNA, cloning cDNA groups which correspond to total mRNA, and determining the base sequence of each
cDNA.

First, cells from a specific tissue or organ, for example, cells derived from human
liver (HepG2), are grown, and the total mRNA is extracted by standard procedures. The mRNA thus obtained is
attached to a vector to construct a cDNA library.

For example, mRNA is attached to the vector plasmid pUC19, which has the M13 sequences flanking
the cloning site, as follows.

pUC19 is cleaved by HincII and PstI and poly-T of 20 bp - 30 bp is added to the PstI-digested end, to
which the 3'-end poly-A tail of the mRNA is hybridized (Fig. 1a). After the DNA strand is extended with
conventional methods using reverse transcriptase, a double-stranded DNA is formed with DNA polymerase
(Fig. 1b). The double-stranded DNA thus obtained is cleaved with the restriction enzyme MboI, which
recognizes a specific four-base sequence (Fig. 1c).

MboI, which recognizes a four-base sequence (GATC), cleaves the DNA within a few hundred bases
from the poly-A tail. Since MboI is found to digest, without exception, about 300 human cDNAs which were
randomly selected from the GenBank data base by the inventors of the present invention, this enzyme
cleaves the cDNA to be cloned at a specific site. In addition, as pUC19 is prepared in dam+ E. coli, e.g., E.
coli JM109, and since its adenine at the MboI recognition site is methylated (GmATC), it is not cleaved by
MboI.
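
The effect of the MboI digestion step can be sketched in a few lines of code. This is an illustrative sketch only, assuming the cDNA is given as an mRNA-sense string ending in a poly-A run; the function name `three_prime_fragment` and the `min_poly_a` threshold are hypothetical and are not taken from the source.

```python
MBOI_SITE = "GATC"  # MboI recognition sequence; the enzyme cuts 5' of the G


def three_prime_fragment(cdna_sense, min_poly_a=10):
    """Return the region between the MboI site nearest the poly-A tail and the tail.

    cdna_sense: mRNA-sense cDNA sequence (5'->3') expected to end in a poly-A run.
    Returns None when no poly-A tail of at least min_poly_a bases is present
    or when the sequence contains no GATC site.
    Note: any A residues immediately preceding the tail are removed together
    with the tail, which is a simplification of this sketch.
    """
    seq = cdna_sense.upper()
    body = seq.rstrip("A")
    if len(seq) - len(body) < min_poly_a:
        return None
    cut = body.rfind(MBOI_SITE)
    if cut == -1:
        return None
    # The retained fragment starts with the GATC overhang left by MboI.
    return body[cut:]


fragment = three_prime_fragment("CCCGATCTTTGATCGGCCTT" + "A" * 20)
```

In this toy input only the site closest to the poly-A tail defines the cloned fragment, which is why each mRNA is represented by a single short 3'-directed insert.
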
Subsequently, in order to prepare a vector containing the double-stranded DNA which has previously
been attached to pUC19 and has the MboI-cleaved end, the pUC19 DNA is digested with BamHI to make
termini cohesive with the MboI-cleaved end. Since the recognition sequence of BamHI (GGATCC) contains
that of MboI (GATC), the extended portion of the double-stranded DNA is not cleaved with BamHI.

The resulting double-stranded DNA is then circularized by standard ligation methods, and the
recombinant vector plasmid thus prepared is introduced into E. coli, e.g., E. coli DH5, in order to make a cDNA
library.

With this method, only a clone containing the base sequence upstream of the poly-A tail of the mRNA
is obtained.

Since the average size of the inserted cDNA fragment is relatively small, 270 bp, it is free from biased
cloning resulting from variations in the efficiency of cDNA synthesis and transformation that occur in the
case of larger sized DNAs. Furthermore, because instability due to repeated base sequences and the like is
eliminated, the cDNA library of the present invention faithfully represents the relative concentration of
mRNA in the cell.

Furthermore, when the cDNA inserted into the vector is relatively short, it is possible to accurately
amplify the cDNA fragment using the sequence of the vector flanking it as a primer. It is also possible to
determine the base sequence from the 5' end directly by the PCR without interference from the 3' poly-A
tail, which would reduce the accuracy of sequence determination.

Amplification of the GS, i.e., the cDNA fragment inserted into the vector, is performed as follows.
The E. coli cells into which the cDNA library is introduced are grown using standard methods and lysed.
Debris contained in the bacterial lysate is removed by centrifugation and the supernatant containing the
vector DNA is recovered. The vector DNA thus obtained is used as the DNA template for amplification by
the PCR (Fig. 1d, amplification with PCR primers 1 and 2).

Base sequences flanking both ends of the GS are properly selected for use as primers and the PCR is
performed under standard conditions. PCR products thus obtained are subjected to the elongation reaction
using fluorescence primers complementary to the vector sequence flanking the 5' end of the GS, and the
sequence is determined with an autosequencer (Fig. 1d, sequence determination with dye primer).

Based on the results of the sequence determination of each GS, the species and the frequency of
appearance of the GS in each tissue or cell type are analyzed.

As to each cell type, not only normal cells but also cells under pathogenic conditions (such as tumor
cells, virus-infected cells, etc.) can be used without any restriction. For example, liver cells (from fetus,
neonate or adult), various hematopoietic cells (granulocytic, monocytic, etc.), lung cells, adipocytes,
endothelial cells, osteoblasts, colon mucosa cells, retinal cells, hepatoma cells (HepG2, etc.), and
promyelocyte leukemia cells (HL60, etc.) will be used. The appearance frequency for each GS is described
for each cell type in Tables 1 through 219. There, patent number represents "SEQ ID NO for each GS",
size represents the "length of each GS", and F represents the "sum of appearance frequencies in the cells
studied". In addition, hepG2 stands for "hepG2 (a liver cancer cell line)", HL60 stands for "HL60
promyelocyte leukemia cell line", granulo stands for "granulocytoid, HL60 stimulated by DMSO", mono
stands for "monocytoids, HL60 stimulated by TPA", 40 w liver stands for "40 w neonatal liver", 19 w liver
stands for "liver of a 19-week-old fetus", adult liver is "adult liver", lung stands for "adult lung", adipose
stands for "subcutaneous adipose tissue", endothel stands for "primary cultured aortic endothelium",
osteoblast stands for "primary cultured osteoblast", colon mucosa is "colon mucosa", small cell carci
stands for "small cell carcinoma of lung", retina is "retina", cerebral cortex is "cerebral cortex", adenocarci
(lung) stands for "adenocarcinoma of lung", squamous cell ca (lung) stands for "squamous cell carcinoma
of lung", keratinocyte stands for "primary cultured keratinocyte", fibroblast stands for "primary cultured
fibroblast", Alzheimer stands for "Alzheimer temporal lobe", cerebellum stands for "cerebellum", visceral
fat is "visceral fat", corneal epithelium is "corneal epithelium", peripheral granulocyte is "peripheral
granulocyte", neuroblastoma is "neuroblastoma" and taste bud of tongue is "taste bud of tongue".

"Accession number of target mRNA" represents the accession number of the entry in GenBank
Release 79 whose base sequence has homology with that of each GS, "match %" represents the percent
homology of the GS sequence relative to that of said homologous sequence, "match starts at (GS)"
represents the base position counted from the 5'-end of the GS at which the region for homology
calculation starts, "match starts at (GenBank)" represents the base position counted from the 5'-end of the
GenBank sequence at which the region for homology calculation starts, and "GenBank target size"
represents the whole length of the GenBank sequence corresponding to the GS. The columns in Tables 1 -
219 represent the same items as in Table 1.
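
The per-cell-type appearance frequency and the column F (sum of appearance frequencies) can be derived from a list of sequenced clones by simple counting. The following is a minimal sketch under stated assumptions: the input format (one `(cell_type, gs_id)` pair per sequenced clone), the function name `appearance_table` and the identifiers `GS_A` and `GS_B` are hypothetical and do not come from the Tables.

```python
from collections import Counter


def appearance_table(clones):
    """Tabulate how often each GS appears in each cell type.

    clones: iterable of (cell_type, gs_id) pairs, one pair per sequenced clone.
    Returns a dict mapping gs_id -> {"per_cell": {cell_type: count}, "F": total},
    where F is the sum of appearance frequencies over all cell types studied.
    """
    counts = {}
    for cell_type, gs_id in clones:
        counts.setdefault(gs_id, Counter())[cell_type] += 1
    return {
        gs_id: {"per_cell": dict(row), "F": sum(row.values())}
        for gs_id, row in counts.items()
    }


# Hypothetical clones from two libraries.
clones = [
    ("HL60", "GS_A"),
    ("HL60", "GS_A"),
    ("adult liver", "GS_A"),
    ("adult liver", "GS_B"),
]
table = appearance_table(clones)
```
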

Based on the data in Tables 1 - 219, each GS can be classified into several groups. A GS which is
expressed at high frequency in a specific cell or groups of cells with similar property, for example,
promyelocyte leukemia cell, granulocyte and monocyte, and not expressed at all or expressed very little
in other cells (groups), is a likely GS corresponding to the gene encoding "the protein responsible for
functions specific to the cell" (e.g., GS0001553, GS0002047, GS004895, etc.). On the other hand, a GS
which is expressed commonly in every kind of cell most likely corresponds to the gene encoding "the
protein essential for the life of the cell" (e.g., GS0000019, GS0000155, GS000861, etc.). In addition, some
GSs are expressed at low frequency (e.g., GS0000013, GS0002399, GS0003155, etc.).
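
The three groups described above can be expressed as a simple rule over the appearance frequencies. The source gives no numerical criteria, so the thresholds `low_total` and `dominant_share` below are assumptions chosen only for illustration, as is the function name `classify_gs`. The sketch also treats specificity for a single cell type only, not for a group of similar cells.

```python
def classify_gs(per_cell, cell_types, low_total=2, dominant_share=0.8):
    """Assign a GS to one of the groups discussed in the text.

    per_cell: dict mapping cell type -> number of times the GS appeared.
    cell_types: all cell types studied (types absent from per_cell count as 0).
    low_total and dominant_share are illustrative thresholds, not values
    taken from the source.
    """
    counts = [per_cell.get(cell, 0) for cell in cell_types]
    total = sum(counts)
    if total <= low_total:
        return "low frequency"
    if all(count > 0 for count in counts):
        return "common"  # candidate gene essential for the life of the cell
    if max(counts) / total >= dominant_share:
        return "cell-specific"  # candidate gene for a cell-specific function
    return "unclassified"


cells = ["HL60", "adult liver", "lung"]
label = classify_gs({"HL60": 9, "adult liver": 1}, cells)
```
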

Since the GS with the sequence determined as described above will reflect the population of mRNA
expressed in a particular cell, it must be possible to find the relative concentration of mRNA in each cell by
determining the appearance frequency for each GS in a cDNA library derived from that cell. Therefore, to
confirm the correlation between the appearance frequency for each GS in a cDNA library and the relative
concentration of cellular mRNA, the GS thus obtained was labeled with 32P by standard methods and used
as the probe in the following hybridization test. mRNA isolated from a specific cell is hybridized to said
32P-labeled probe under standard conditions. The results of this Northern hybridization test were such that,
when a GS found with high appearance frequency in a cDNA library was used as a probe, a dense band
was formed, confirming the correlation of the frequency of appearance of the GS with the relative
concentration of mRNA in the cell (see Example 5).

Similarly, the colony hybridization test of the cDNA library constructed as described above with a
32P-labeled probe prepared as described above showed a close correlation between the frequency of
appearance of the GS and the number of colonies hybridized with said GS (see Example 6), confirming the
correspondence of the frequency of appearance of the GS and relative concentration of the GS in a cDNA
library.
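
If the appearance frequency of a GS is taken as an estimate of the relative concentration of its mRNA, the estimate is simply the number of clones carrying that GS divided by the number of clones sequenced. The sketch below adds an approximate 95% Wilson score interval to show how uncertain such an estimate is for rarely observed GSs. The interval is a standard statistical device introduced here as an assumption; it is not a procedure described in the source, and the function name `relative_abundance` is hypothetical.

```python
import math


def relative_abundance(count, total_clones, z=1.96):
    """Estimate the relative concentration of an mRNA from clone counts.

    count: number of sequenced clones carrying the GS.
    total_clones: number of clones sequenced from the library.
    Returns (fraction, lower, upper), where lower and upper bound an
    approximate Wilson score interval (z=1.96 corresponds to about 95%).
    """
    if total_clones <= 0 or not 0 <= count <= total_clones:
        raise ValueError("count must lie between 0 and total_clones")
    p = count / total_clones
    denom = 1 + z ** 2 / total_clones
    center = (p + z ** 2 / (2 * total_clones)) / denom
    half = z * math.sqrt(
        p * (1 - p) / total_clones + z ** 2 / (4 * total_clones ** 2)
    ) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical library: a GS seen 10 times among 1000 sequenced clones.
fraction, lower, upper = relative_abundance(10, 1000)
```
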

From the above results, by determining the appearance frequency of each GS in a cDNA library
derived from a variety of cells, it has become possible to determine the expression status of the gene (or
mRNA) corresponding to each GS. This fact implies conversely that each GS may be useful for industrial
purposes as a specific probe or primer encoding information about the expression status of its
corresponding gene (or mRNA) for each cell. For example, when it is proven that "a certain GS appears at high
frequency only in a cDNA library derived from tissue A, that is, the gene corresponding to said GS is
specifically expressed only in tissue A", by conventional cloning of the corresponding full-length cDNA
using said GS as a probe or primer, it is possible to clone a full-length gene which is expressed in a
tissue-specific manner.

Furthermore, for example, when it is proven that "the frequency of appearance of a certain GS is low in
a cDNA library derived from tissue B, that is, the expression frequency of the gene corresponding to said
GS is low in tissue B", by examining the expression frequency of the gene corresponding to said GS in a
test sample of tissue B from a patient using said GS as a probe or primer, it may be possible to identify the
pathogenic gene, an unusually high expression frequency of said gene being a strong indication
that said GS may correspond to the gene involved in the pathogenesis. Furthermore, by conventional methods for
cloning said full-length cDNA using said GS as a probe or primer, it is possible to isolate said pathogenic
gene and elucidate its characteristics.

In practice, the DNA of the present invention may be used as a probe or primer for detecting and
diagnosing disease, cloning a pathogenic gene or related gene, cloning a viral gene, identifying and
recognizing cell types, cloning a species-specific promoter and gene mapping.

One GS corresponds to one mRNA. It is therefore obvious that any portion of cDNA complementary to
each mRNA carries the same "information for expression" as the GS. Accordingly, the DNA of the present
invention is not restricted to "the DNA comprising the GS itself or a portion thereof", but also includes the
DNA comprising, for example, "a full-length cDNA complementary to each mRNA" and "the non-GS region
of the cDNA complementary to each mRNA or a portion thereof". They carry the same "expression
information" as the GS and can be used as a probe or primer in a
similar manner as a GS. For example, by using a GS or a portion thereof as a probe or primer, it is
obviously possible for those skilled in the art to readily isolate "a full-length cDNA corresponding to each
mRNA" or "the non-GS region of the cDNA complementary to each mRNA or a portion thereof". For
example, as described hereinafter, conventional techniques such as "5' RACE", "nesting" and "inverse
PCR" can be used.

An example of the method for detecting disease using the GS of the present invention will be
described. As shown in Tables 1-219, with the method described above it is possible to detect a GS
present specifically in a cDNA library constructed from each tissue by detecting and comparing the
frequency of appearance of GS in each tissue. It is also possible to identify a GS corresponding to a protein
which is expressed commonly in various tissues or which is expressed at low frequency. These GSs are
denatured and then fixed on an appropriate filter, for example, nylon filter or nitrocellulose filter. It is
convenient to use a single filter with many GSs fixed on it. Usage of a single filter on which many
DNAs are fixed is well known. An example may be "the Escherichia coli [...]"
(Takara Shuzo code No. 9035). It is a single nylon filter on which the cosmid contigs of genomic DNA of E.
coli are fixed. It is possible to prepare a filter comprising a group of specific GSs corresponding to proteins
expressed in a particular tissue, a filter comprising a group of GSs corresponding to proteins commonly
expressed in various tissues, or a filter comprising a group of GSs corresponding to proteins expressed at
low frequency. The single-stranded GSs fixed on these filters are then hybridized to labeled complementary
fragments synthesized using "random primers" from template mRNA extracted from a test
tissue, with labeled nucleotides and reverse transcriptase (labeled mRNA can also be hybridized to
the filters). Similarly, labeled complementary fragments synthesized using the mRNA extracted from normal
tissue as the template are hybridized (labeled mRNA can also be hybridized to the filters). If the profile of
hybridization to a group of GSs has been categorized beforehand by comparing the hybridization profile of
various pathogenic tissues to that of corresponding normal tissues, it is possible to diagnose the pathogenic
condition of a particular test tissue by comparing the hybridization profile of the test tissue with that of the
corresponding normal tissue and assigning that profile to a certain category. Virus infection can be detected
in the same manner as in the case of other diseases.
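
The step of assigning a test hybridization profile to a previously defined category can be sketched as a nearest-profile comparison. The source does not specify how profiles are compared, so the normalization and the Euclidean distance used below are assumptions made for illustration, and all names and intensity values are hypothetical.

```python
import math


def normalize(profile):
    """Scale signal intensities so that they sum to 1 (unchanged if all are 0)."""
    total = sum(profile.values())
    if total == 0:
        return dict(profile)
    return {gs: value / total for gs, value in profile.items()}


def distance(a, b):
    """Euclidean distance between two profiles; missing GSs count as 0."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def assign_category(test_profile, reference_profiles):
    """Return the name of the reference profile closest to the test profile.

    test_profile: dict mapping GS id -> hybridization signal intensity.
    reference_profiles: dict mapping category name -> profile dict.
    """
    test = normalize(test_profile)
    return min(
        reference_profiles,
        key=lambda name: distance(test, normalize(reference_profiles[name])),
    )


# Hypothetical reference categories and a hypothetical test tissue.
references = {
    "normal": {"GS_A": 10, "GS_B": 10},
    "pathogenic": {"GS_A": 30, "GS_B": 2},
}
category = assign_category({"GS_A": 28, "GS_B": 3}, references)
```
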

Next, an example of the method for cloning pathogenic genes or their related genes using the GS of the
present invention is described. As described above, using the filter on which denatured GSs are fixed, the
hybridization profile of various pathogenic tissues and that of corresponding normal tissues are
compared. A great difference in the hybridization intensity between normal and pathogenic tissues
will be an indication that the particular GS corresponds to a pathogenic gene. If a filter comprising only GSs
specific for a particular tissue is applied to a sample from that particular tissue, the probability of finding
the GS with a great difference in hybridization intensity is elevated. Also, a filter comprising GSs
whose expression is low will facilitate the identification of the GS corresponding to
the pathogenic gene [...] signal, because the hybridization signal for these GSs is
[...]. Once a GS corresponding to a pathogenic gene is found, said pathogenic gene can be
cloned by established methods such as genomic Southern hybridization using said GS as a probe and/or a
[...].

Furthermore, a method for cloning a full-length gene using a GS as a probe or primer is described in
detail. Cloned genes isolated in the present invention are also appropriate for use in the production of
proteins useful as pharmaceutical products. mRNA is extracted from tissues by conventional methods and
a cDNA library is prepared (see Molecular Cloning, 2nd ed., Vol. 2, Section 8, New York: Cold Spring
Harbor Laboratory Press). In this case, it is preferable to extract mRNA from tissues in which the target gene is
expressed. One method to detect a specific gene in libraries thus prepared is, for example, to select
positive clones via hybridization using a whole or partial GS as a probe. In general, since a GS is specific
to the corresponding mRNA, hybridization can be carried out under certain stringent conditions. Probes used are
[...] long, preferably more than 50 bases long, and more preferably more than 100
bases long.
Furthermore, if cDNA libraries in which the cDNA for a specific gene is concentrated are prepared,
they are preferable for selecting said specific gene. One method useful for this purpose is as
follows: 1) preparation of an affinity chromatographic column of resin on which the denatured GS
corresponding to the specific gene is fixed; 2) application of mRNA extracted from a tissue to said column
and retention of the mRNA species corresponding to the specific gene on said column; 3) elution and
concentration of said retained mRNA; and finally 4) preparation of cDNA libraries using the concentrated
mRNA as the template. Another method is the selective amplification of cDNA corresponding to the
specific gene by the PCR. Selective amplification of a specific gene is carried out as follows: using a partial
sequence of the GS localized toward the 3' end of the specific gene as a primer, single-stranded cDNA is
synthesized from mRNA with reverse transcriptase and 4 dNTPs. To the 3' end of the single-stranded cDNA thus
obtained a homopolymer [...] is attached by the action of "terminal deoxyribonucleotide transferase (TdT)".
Using "a primer complementary to the homopolymer" and "a primer used in said reverse
transcriptase reaction, or a primer whose sequence is included in the same GS [...]
end", cDNA corresponding to the specific gene may be selectively amplified by the PCR [see
5' RACE (Rapid Amplification of cDNA Ends): PNAS, Vol. 85, pp. 8998 - 9002 (1988); Nucleic Acids Res.,
Vol. 17, pp. 2919-2932 (1989)]. In addition, instead of the attachment of a homopolymer there is another
method in which a single-stranded anchor DNA is linked to the 3' end of a single-stranded
cDNA [...] anchor DNA [Nucleic Acids Res., Vol. 19, pp. 5227-5232 (1991)]. Said primer is
more than 13 bases long, preferably more than 15 bases long, and more preferably more than 18
bases long. Furthermore, in order to enhance the efficiency of heat denaturation in the cycling reaction, said
primer is preferably less than 50 bases long and more preferably less than 30 bases long. By linking said
amplified DNA to a vector, a cDNA library concentrated with respect to the target gene is prepared.
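
The primer length preferences stated above can be collected into a small helper. This is a sketch only: the grade labels and the function name `primer_length_grade` are hypothetical, and the boundaries simply transcribe the "more than" and "less than" figures given in the text.

```python
def primer_length_grade(primer):
    """Grade a primer by length according to the preferences in the text.

    Required: more than 13 bases.
    Preferred: more than 15 and less than 50 bases.
    More preferred: more than 18 and less than 30 bases.
    """
    n = len(primer)
    if n <= 13:
        return "unsuitable"
    if 18 < n < 30:
        return "more preferred"
    if 15 < n < 50:
        return "preferred"
    return "acceptable"


grade = primer_length_grade("ACGTACGTACGTACGTACGT")  # 20 bases
```
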

In addition, it may be also possible to isolate a cDNA clone corresponding to the specific gene directly 
from the PCR products. Specifically, the PCR products are first separated by gel electrophoresis, subjected 
to Southern blotting analysis using the denatured GS as a probe, and examined for the presence of a band 
which specifically hybridizes to said GS. If a GS-hybridized band is detected, it is highly possible to isolate 
the cDNA clone corresponding to the specific gene by excising said band from the gel and subjecting it to 
direct cloning. 

As described above, in order to further amplify the specific gene previously amplified by the PCR, it 
may be possible to perform the second PCR of the primary PCR products by replacing either or both 
primers previously used with a primer having the base sequence internal to said two primers (nesting) 
(Journal of Virology, Vol. 64, p. 864 (1990)). Nesting may be performed directly upon the products of the 
primary PCR. Alternatively, if a band which specifically hybridizes to the GS is detected by the Southern 
blotting analysis of the primary PCR products, nesting may be performed for the DNA obtained by excision 
of the band followed by extraction. In the case where a band which specifically hybridizes to the GS is 
detected by the Southern blotting analysis of nested products using the denatured GS as a probe, it is 
highly possible to successfully isolate the cDNA clone corresponding to the target gene by excising said 
band from the gel and subjecting it to direct cloning. 

The isolated cDNA clone corresponding to the target gene often corresponds to the full-length
mRNA, but it may be a cDNA with the 5' end deleted. In the case where the 5' end is deleted, it is possible
to isolate the full-length cDNA clone by conventional methods. For example, when a cDNA library is screened
using a probe comprising the base sequence in the 5' end region of the cloned cDNA, the target
position of said probe is shifted further toward the 5' end of the full-length cDNA than in the case of using a
GS as a probe, so that only longer cDNA clones are isolated as positive clones. Alternatively, cDNA may be
synthesized using "a primer comprising the base sequence in the 5' end region of the cloned cDNA" with mRNA
as the template, followed by PCR amplification of "a single-stranded cDNA having a homopolymer or anchor
DNA sequence at the 5' end" using "the primer used for the previous cDNA synthesis or a primer having
the sequence internal to that of said primer" and "a primer complementary to the homopolymer or anchor
DNA" as described above for the 5' RACE method. In this way only the sequence toward the 5' side of the cDNA
is selectively amplified, since the position of said primer is shifted further toward the 5' side of the
full-length cDNA. Even if the cDNA thus obtained has a deletion at the 5' end, a population of cDNA
fragments covering the full length of the long cDNA may be obtained by repeating this procedure. Those
skilled in the art can readily obtain a full-length cDNA by suitably linking said cDNA fragments having
overlapping segments.

Alternatively, by performing the inverse PCR (Inverse PCR: Genetics, Vol. 120, p. 621 (1988); Molecular
Cloning, 2nd ed., Vol. 2, 14.12-14.13 (New York: Cold Spring Harbor Laboratory)), it may be possible to
isolate a clone extending externally from the GS, that is, into the genomic DNA region. Specifically, the
target DNA (genomic DNA or cDNA) is digested with restriction enzymes into fragments of about 2-3 kb
and then circularized by ligating the cleaved ends. By performing the PCR for said DNA using "a set of
primers which are complementary to the GS, or to the cDNA clone isolated using the GS as a probe or primer",
oriented so that the directions of DNA synthesis are mutually opposite (outward), it may be possible to
amplify the DNA region extending externally from the GS. A method is known for isolating the full-length
genomic DNA of a specific gene by repeating this procedure (Nucleic Acids Res., Vol. 16, p. 8186 (1988)).
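
The geometry of the inverse PCR described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the sequences are short hypothetical strings, the known region (standing in for the GS) occurs once in the fragment, and the function names are introduced here and are not taken from the cited references.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def inverse_pcr(fragment, known, primer_len=4):
    """Simulate inverse PCR on a restriction fragment after circularization.

    Returns (forward_primer, reverse_primer, product), all written 5'->3'.
    """
    if 2 * primer_len > len(known):
        raise ValueError("known region too short for two non-overlapping primers")
    start = fragment.index(known)            # position of the known region
    end = start + len(known)
    left_flank, right_flank = fragment[:start], fragment[end:]
    forward_primer = known[-primer_len:]                      # extends outward to the right
    reverse_primer = reverse_complement(known[:primer_len])   # extends outward to the left
    # Ligation joins the end of the right flank to the start of the left flank,
    # so the outward-facing primers amplify both flanks as one product.
    product = forward_primer + right_flank + left_flank + known[:primer_len]
    return forward_primer, reverse_primer, product


fragment = "CCGGA" + "ATGCGTACGTTA" + "TTGAC"   # left flank + known region + right flank
fwd, rev, product = inverse_pcr(fragment, "ATGCGTACGTTA")
print(fwd, rev, product)   # GTTA GCAT GTTATTGACCCGGAATGC
```

The product contains both unknown flanks joined at the ligation site, which is why repeating the procedure walks outward from the known sequence.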

In addition, although "Taq polymerase" is conventionally used in the PCR described above, the cloning
procedure may be performed more efficiently using the "LA PCR (long and accurate PCR)" technique
(Nature Genet., Vol. 7, p. 350-351 (1994); Nature, Vol. 369, p. 684-685 (1994)).

Furthermore, needless to say, by linking the full-length gene thus obtained to a suitable expression
vector and expressing it in an appropriate host, it is possible to obtain the corresponding gene
product (Molecular Cloning, 2nd ed.).

Next, there will be described an example of the method for identifying and recognizing cell types using 
the GS of the present invention. As shown in Tables 1-219, based on the appearance frequency of GS in 
each tissue and its comparison among tissues, it is possible to identify those GSs specifically present in a 
cDNA library constructed for each tissue. These "tissue-specific GSs" are fixed on a filter. It is more
convenient if the GSs specific to each tissue are collected and fixed on a filter as a block (e.g., a GS block
specific for hepatocytes or cerebral cortex cells). As described above, to this filter are hybridized labeled
complementary fragments prepared from mRNA extracted from test tissues or cells, using "random
primers", "nucleotide containing 4 labeled nucleotides", and "reverse transcriptase". (Directly labeled
mRNA can also be hybridized to the filters.) Depending on the type of tissues or cells, intense hybridization
signals will be observed with the GS groups specific to said tissue or cell. Furthermore, a tissue-specific
promoter can be cloned by structural analysis of the 5' upstream sequence through the cloning of the
corresponding gene using established methods such as genomic Southern hybridization with the
"tissue-specific GS" as the probe and/or primer.
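
The selection of tissue-specific GSs from appearance frequencies can be sketched as follows. This is a minimal illustration, not the procedure used to compile Tables 1-219: the GS names, tissue names and counts are hypothetical, and the rule applied here (hits in exactly one tissue, with at least `min_hits` occurrences) is an assumed criterion.

```python
def tissue_specific_gs(counts, min_hits=2):
    """Group GSs by the single tissue in which they appear.

    counts maps each GS name to a dict of {tissue: number of clones found}.
    A GS is kept only if it appears in exactly one tissue, at least min_hits times.
    """
    specific = {}
    for gs, per_tissue in counts.items():
        expressed = [tissue for tissue, n in per_tissue.items() if n > 0]
        if len(expressed) == 1 and per_tissue[expressed[0]] >= min_hits:
            specific.setdefault(expressed[0], []).append(gs)
    return specific


# Hypothetical appearance frequencies (clones per library), for illustration only.
counts = {
    "GS_A": {"liver": 12, "brain": 0, "lung": 0},
    "GS_B": {"liver": 3, "brain": 5, "lung": 1},
    "GS_C": {"liver": 0, "brain": 4, "lung": 0},
    "GS_D": {"liver": 0, "brain": 0, "lung": 1},
}
print(tissue_specific_gs(counts))   # {'liver': ['GS_A'], 'brain': ['GS_C']}
```

Each resulting group corresponds to one "GS block" that could be fixed on a filter for the tissue concerned.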

These tissue-specific promoters thus obtained will be useful for gene therapy in the future.

Gene therapy in a narrow sense aims to supplement the defective protein of patients using gene 
technology, and in this case it is necessary to express the exogenous gene in a desired tissue in a desired 
quantity. For this purpose, a promoter which is known to be expressed in a specific tissue in a desired 
quantity (in most cases a large quantity is desired) is highly useful. Although, at present, a virus promoter is 
often used, it can be inactivated by endogenous modification such as methylation. Promoters provided by 
tissue-specific GSs will be ideal substitutes for viral promoters. 

There will be described the method for chromosomal assignment of DNA corresponding to the GS of 
the present invention using the probe derived from the GS obtained as described above. 

First, the Southern blotting method will be described. 

According to this method, for example, chromosomes are isolated from a lymphoblast cell line of 
human normal karyotype (e.g., GM0130b), and then a monochromosomal hybrid cell is prepared by 
introducing each human chromosome into non-human cells, such as rodent cells, and cultured on a large 
scale by standard methods. Then the DNAs extracted from said hybrid cells are digested with various 
restriction enzymes and subjected to agarose gel electrophoresis. Then, the electrophoresed DNAs are 
hybridized to 32P-labeled GS prepared as described above and used as the probe. By identifying the hybrid
cell the DNA of which is hybridized to said probe, it is possible to identify the chromosome in which the 
DNA corresponding to the GS of the present invention is present. Southern hybridization test of the total 
human genomic DNA using each labeled GS as a probe formed a single band corresponding to the GS, 
indicating that the DNA of the present invention can be used as a desirable probe for human genomic DNA. 
It is obvious that a desirable probe for human genomic DNA can be used also as a desirable probe for 
human cDNA and human mRNA. 

A method similarly using the PCR to determine chromosomal localization of the GS of the present 
invention will be described. 

To prepare the most appropriate primers, base sequences are selected from the sequence of the GS in
question by conventional methods, for example, by using the computer software OLIGO 4.0 (National
Biosciences), and oligonucleotides (20-24mer) having the selected sequences are synthesized. The
preferred size of the sequence to be amplified by the PCR is from 50mer to 100mer.

Using the primers thus synthesized and the chromosomal DNA extracted from the monochromosomal
hybrid cells as the template, amplification by the PCR is performed in a conventional manner.
The resulting PCR products are subjected to non-denaturing acrylamide gel electrophoresis and stained with
ethidium bromide for fluorescent detection. The sizes of these PCR products are then determined.

Chromosomal assignment is confirmed when a PCR product of the correct size is detected.
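
The assignment rule can be sketched as a simple comparison of observed and expected product sizes across the hybrid cell panel. The panel data below are hypothetical, and the size tolerance is an assumed parameter chosen for illustration.

```python
def assign_chromosomes(expected_size, panel_products, tolerance=5):
    """Return the chromosomes whose hybrid-cell DNA gave a product of the expected size.

    panel_products maps a chromosome label to the list of product sizes (nt)
    observed with DNA from the corresponding monochromosomal hybrid cell.
    """
    hits = []
    for chromosome, sizes in panel_products.items():
        if any(abs(size - expected_size) <= tolerance for size in sizes):
            hits.append(chromosome)
    return hits


# Hypothetical gel readings for a GS with an expected product of 80 nt.
panel = {"1": [], "3": [81], "11": [140], "X": [78, 200]}
print(assign_chromosomes(80, panel))   # ['3', 'X']
```

A single hit assigns the GS to one chromosome; more than one hit indicates that products of the expected size arise from several chromosomes.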

It is evident that a chromosome or chromosomes in which the DNA corresponding to a GS is localized 
can be identified by using these procedures. It has also become evident that the DNA of the present 
invention can be used as desirable primers for human genomic DNA since a single band has resulted from 
amplification of the total human genomic DNA by the PCR using primers designed based on each tested 
GS. Obviously, a desirable primer for human genomic DNA is also a desirable primer for human cDNA and 
human mRNA. 

Brief Description of Figures 

Fig. 1 shows the preparation of the 3' MboI cDNA library.

Fig. 2 shows the results of tests of primers. A shows the location of primers on the vector; and B shows
the electrophoretic patterns of DNA fragments amplified using the primers (A). Primers used are as follows:
lane 1, FW (-40)/RV (-14); lane 2, FW (-40)/RV (-36); lane 3, FW (-40)/RV (-71); lane 4, FW (-40)/RV (-29);
and lane 5, FW (-47)/RV (-48). Artifacts are indicated by arrows.

Fig. 3 shows the electrophoretic pattern of PCR products using FW(-40) and RV(-14) as primers. The 
lane at the right end shows the electrophoretic pattern of size markers and the other lanes show the PCR 
products using FW (-40)/RV (-14) as primers. 

Fig. 4 shows that the frequency of appearance of each GS in the cDNA library reflects the mRNA
concentration: Figs. 4A-4D, experimental results; Fig. 4E, photographs of colonies; and Fig. 4F,
summary.

Fig. 5 shows the appearance frequencies for various cDNAs in the 3'-directed HepG2 cDNA library. 

Fig. 6 shows the genetic mapping of each GS (gs) using PCR. 

Fig. 7 shows the genetic mapping of each GS (gs) using PCR. 
Fig. 8 shows the genetic mapping of each GS (gs) using PCR.

Fig. 9 shows the genetic mapping of each GS (gs) using PCR. 

Fig. 10 shows the genetic mapping of each GS (gs) using PCR. 

Fig. 11 shows the chromosomal mapping of GS001418 (gs001418) using PCR. 

Fig. 12 shows the chromosomal mapping of GS001457 (gs001457) using PCR. 
Fig. 13 shows Southern blotting of human total chromosomes using the GS as a probe.

Fig. 14 shows Southern blotting of human total chromosomes using the GS as a probe. 

Fig. 15 summarizes the characteristics of hybrid cells used for Southern hybridization. 

Fig. 16 shows Southern blotting of chromosomal DNA from the hybrid cells using GS000152 (clone
s14g02) as a probe.

Fig. 17 shows Southern blotting of chromosomal DNA from the hybrid cells using GS000041 (clone
s650) as a probe.

Fig. 18 shows Southern blotting of chromosomal DNA from the hybrid cells using GS000181 (clone 
hm01e01) as a probe. 

Fig. 19 shows Southern blotting of chromosomal DNA from the hybrid cells using GS000055 (clone 
d 3a 18) as a probe.

Fig. 20 shows Southern blotting of chromosomal DNA from the hybrid cells using GS000180 (clone 
s479) as a probe. 

Fig. 21 shows Southern blotting of chromosomal DNA from the hybrid cells using GS000094 (clone
s173) as a probe.

Fig. 22 shows Southern blotting of chromosomal DNA from the hybrid cells using junk (clone hm01g02)
as a probe.

Fig. 23 shows the chromosomal mapping of each GS by Southern blotting. E stands for EcoRI, Ba
stands for BamHI, Bg stands for BglII and E/B stands for double digestion with EcoRI and BamHI.

Fig. 24 shows the chromosomal mapping of each GS by Southern blotting. E stands for EcoRI, Ba
stands for BamHI, Bg stands for BglII and E/B stands for double digestion with EcoRI and BamHI.

Fig. 25 shows the chromosomal mapping of each GS by Southern blotting. E stands for EcoRI, Ba
stands for BamHI, Bg stands for BglII and E/B stands for double digestion with EcoRI and BamHI.

Fig. 26 shows the chromosomal mapping of each GS by Southern blotting. E stands for EcoRI, Ba
stands for BamHI, Bg stands for BglII and E/B stands for double digestion with EcoRI and BamHI.


Preferred embodiments of the invention 

In the following section, there will be explained preferred embodiments of the present invention. 
However, the present invention will not be restricted to these preferred embodiments. 


[Example 1]
Preparation of mRNA 

Cytoplasmic RNA was extracted from the liver cancer cell line HepG2 (Aden et al., Nature 282, 615-617,
1979) using standard procedures [Sambrook, J., et al., Molecular Cloning, 2nd ed. (New York: Cold Spring
Harbor Laboratory), vol. 1, pp. 7.3-7.36, 1989]. Briefly, HepG2 cells grown in Dulbecco's modified Eagle
medium supplemented with 10% FCS were lysed in RNA extraction buffer [0.14 M NaCl, 1.5 mM MgCl2, 10
mM Tris-HCl (pH 8.6), 0.5% NP-40, 1 mM DTT, 1000 units/ml RNase inhibitor (Pharmacia)] by using a
Vortex mixer for 30 sec and then left standing on ice for 5 min. Nuclei and other cell debris were
precipitated by centrifuging at 12,000 g for 90 sec, and the supernatant was deproteinized with Proteinase K
followed by phenol extraction. RNA was precipitated with isopropanol and rinsed with 70% ethanol. Finally,
the poly A+ fraction was collected by oligo dT column fractionation (Aviv et al., Proc. Natl. Acad. Sci. USA
69, 1408-1412, 1972).

[Example 2]

Preparation of vector primer DNA and construction of cDNA libraries 

To prepare a vector primer, pUC19 DNA amplified in JM109 cells (Yanisch-Perron, C., et al., Gene 33,
103-119, 1985) was digested with PstI to completion and a poly T-tail was added with terminal transferase
(Pharmacia) to a mean length of 26. This process was monitored by the incorporation of 3H-deoxythymidine
triphosphate [Okayama, H., et al., Methods in Enzymology (San Diego: Academic Press), vol. 154, pp. 3-28,
1987]. The product was digested with HincII, and the resulting short fragments were eliminated by
chromatography with Sepharose S-300. Then the T-tailed plasmid was purified on an oligo dA column and
stored in 50% ethanol at a concentration of 1 µg/µl.

Fig. 1 shows the outline of the construction of the cDNA library. Two micrograms each of the
cytoplasmic poly A+ RNA and the vector primer DNA were co-precipitated in 70% ethanol containing 0.3 M
sodium acetate, and the pellet was dissolved in 12 µl of distilled water. For the first strand synthesis, after
heat denaturation at 76°C for 10 min, 4 µl of 5× reaction buffer [250 mM Tris-HCl (pH 8.3), 375 mM KCl, 15
mM MgCl2], 2 µl of 0.1 M DTT and 1 µl of 10 mM each of dATP, dCTP, dGTP and dTTP were added to
the sample at 37°C. The reaction was initiated by the addition of 200 units of reverse transcriptase
MMLV-H-RT (BRL) and, after incubation at 37°C for 30 min, stopped by transferring the reaction tube onto ice.
For the second strand synthesis, the following were added to the aforementioned reaction mixture: 92 µl of
distilled water, 32 µl of 5× E. coli reaction buffer [100 mM Tris-HCl (pH 7.5), 20 mM MgCl2, 50 mM
(NH4)2SO4, 500 mM KCl, 250 µg/ml of BSA, 750 µM βNAD], 3 µl of 10 mM each of dATP, dCTP, dGTP and
dTTP, 15 units of E. coli ligase (Pharmacia), 40 units of E. coli polymerase (Pharmacia), and 1.5 units of
E. coli RNase H (Pharmacia). The reaction mixture was then incubated at 16°C for 2 h and heated to 65°C
for 15 min. Then 20 units each of BamHI and MboI were added, and the reaction mixture was incubated at
37°C for 1 h and heated again at 65°C for 30 min. Finally, the sample was diluted up to 1 ml with 1× E.
coli reaction buffer, and 100 units of E. coli ligase were added. The resulting mixture was incubated at
16°C overnight. An aliquot of this mixture was used to transform competent E. coli DH5 cells (Toyobo).
Transformants were selected by ampicillin resistance. The product was named "3' MboI cDNA library".

[Example 3]

Amplification of cDNA insert by PCR 

The plasmid-carrying E. coli colonies were picked into 96-well plates containing 125 µl of LB medium
(Davis, R. W., et al., Advanced Bacterial Genetics, New York: Cold Spring Harbor Laboratory, 1980) in each
well and incubated in a moist chamber at 37°C for 24 h. A replica culture was made for every plate using a
Spired replica device (Sigma) and the master plates were stored at -80°C for future use. After overnight
incubation at 37°C, 50 µl of the culture from each well of these replicas were transferred to polycarbonate
96-well plates (Techne). Bacteria were collected by centrifugation in an Omnispin H4211 rotor (Sorvall) at
1500 rpm for 5 min, resuspended in 50 µl of water, covered with a layer of mineral oil and lysed at 95°C
for 30 min in a metal bath. Debris was removed by centrifugation at 3600 rpm for 30 min in the same rotor.

Five microliters of the supernatant were added to 20 µl of distilled water and kept at 95°C for 10 min
under a layer of mineral oil. Then the denatured lysate was subjected to PCR by adding 25 µl of 2×
reaction mixture [40 mM Tris-HCl (pH 8.9 at 23°C), 3 mM MgCl2, 50 mM KCl, 200 µg gelatin/ml]
containing 5 pmol each of primers, 5 nmol each of dATP, dCTP, dGTP, dTTP and 1.25 units of Taq DNA
polymerase (Cetus) at 70°C. Temperature cycling reactions were carried out immediately after addition of
the reaction mixtures using a thermal cycler either for microfuge tubes (PJ1000, Perkin Elmer Cetus) or for
a 96-well plate (PHC-3, Techne); 35 repeated cycles of 30 sec at 96°C, 1 min at 55°C, and 2 min at 72°C
without a final extension step were performed.
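
As a small worked calculation, the total programmed hold time of this cycling protocol follows directly from the figures above. The sketch ignores temperature ramp times, which depend on the instrument and are not stated in the text.

```python
CYCLES = 35
# Hold times per cycle in seconds, as stated in the text.
STEPS_SEC = {
    "denaturation at 96 C": 30,
    "annealing at 55 C": 60,
    "extension at 72 C": 120,
}


def total_hold_time_minutes(cycles, steps_sec):
    """Total programmed hold time in minutes, excluding ramping between steps."""
    return cycles * sum(steps_sec.values()) / 60


print(total_hold_time_minutes(CYCLES, STEPS_SEC))   # 122.5
```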

For this method, the correct choice of primers for the PCR is crucial. Therefore, preliminary
tests were performed using the following primers with a predicted Tm above 60°C.
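
The text does not state how the Tm was predicted. As a hedged illustration only, the sketch below uses a commonly quoted basic formula for oligonucleotides longer than about 13 bases, Tm = 64.9 + 41 × (G + C − 16.4) / N, which is an assumption and not necessarily the method used here. The 24-mer shown is an illustrative sequence, not one of the primers FW(-40), RV(-14), etc.

```python
def basic_tm(primer):
    """Estimate Tm (deg C) with the basic GC-content formula for oligos over ~13 nt.

    Assumed formula: Tm = 64.9 + 41 * (G + C - 16.4) / N.
    """
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(primer)


example_24mer = "CGCCAGGGTTTTCCCAGTCACGAC"   # illustrative sequence, 15 G/C out of 24 bases
tm = basic_tm(example_24mer)
print(round(tm, 1), tm > 60)   # 62.5 True
```

More accurate predictions use nearest-neighbor thermodynamics and depend on salt and primer concentrations, so a value from this formula is only a rough screen.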

The primers tested were a pair of primers, FW(-47) and RV(-48), which are identical to the commercially
available 24 mer primers; a second pair of primers, FW(-40) and RV(-29), which are a longer version
of the well-tested sequencing primers; and the primers RV(-71) and RV(-14), which have a triplet
sequence at the 3' terminus identical with that in FW(-40) but in the opposite orientation (Fig. 2A).

In most of the cases where various combinations of primers were tested, short PCR artifacts appeared
besides the expected major products (Fig. 2B; arrows indicate the PCR artifacts). These artifacts could be
reduced by raising the annealing temperature, lowering the primer concentration or lowering the substrate
concentration, but in all cases the yield of the products was not high enough to serve as a template for the
sequencing reaction without concentration thereof.

However, since one pair of primers [FW(-40) and RV(-14)] did not yield artifacts (Fig. 3), this pair was
selected for further tests, and was found to give reproducible results. Similar results were obtained with
randomly selected cDNA clones. Therefore, only this pair of primers, FW(-40) and RV(-14), was used as the
primers of the present embodiment.

[Example 4] 
DNA sequencing

The PCR products were drop-dialyzed against TE [10 mM Tris-HCl (pH 8.0), 1 mM EDTA] on Millipore
[...] mm diameter. Without further purification, the samples were subjected to the Cycle Sequencing protocol
(Applied Biosystems, 1991) using dye-labeled primers with minor modifications. For the dideoxycytidine
sequencing reaction, 2 µl of the dialyzed PCR reaction product (about 0.2 pmol of template DNA) were
added to 3 µl of a reaction mixture containing 0.4 pmol of FAM M13 (-21) Primer (Applied Biosystems) in
160 mM Tris-HCl (pH 8.9), 40 mM (NH4)2SO4, 10 mM MgCl2, 50 µM dATP, 12.5 µM dCTP, 75 µM
7-deaza-dGTP (Boehringer Mannheim Biochemicals), 50 µM dTTP, 25 µM ddCTP and 0.8 unit of Taq
polymerase (Perkin Elmer Cetus), and subjected to 15 plus 15 cycles of the reaction (95°C 30 sec, 60°C 1
sec, 70°C 1 min and 95°C 30 sec, 70°C 1 min) according to the manufacturer's recommendation in a
96-well plate using a thermal cycler (PHC-3, Techne). The three other sequencing reactions, for
dideoxyguanosine, dideoxyadenosine, and dideoxythymidine, were performed in parallel (with TAMRA, JOE and
ROX primers respectively, supplied by Applied Biosystems) in an identical fashion, except that twice the
volume of all the ingredients was added to the dideoxyguanosine and dideoxythymidine reactions. Each
sample from a set of four was cooled to 4°C, pooled, precipitated with ethanol, resuspended in 6 µl of a
solution of formamide/50 mM EDTA (5/1, v/v), loaded onto a sequencing gel and analyzed by a DNA
autosequencer (Model 373A Ver 1.0.1, Applied Biosystems).

[Example 5]

The frequency of appearance of each GS of the cDNA library reflects mRNA population 

To confirm that our 3'-directed regional cDNA library was a non-biased representation of the mRNA
population in HepG2 cells, the inserts of four cDNA clones (EF-1α, α1-antitrypsin, hnRNP core protein A1
and inter-α-trypsin inhibitor) from the clones redundantly obtained by random selection of cDNA were
radiolabeled and used as probes in a Northern analysis of poly A+ mRNA from the HepG2 cells. (The
results are shown in Fig. 4A-D, and summarized in Fig. 4F.) The relative band intensities of the four mRNA
species demonstrated that their relative ratios were 52, 24, 1 and 1.2, respectively (lane iii in Fig. 4F). Then
the same set of probes was used for measuring the number of colonies hybridizing with each probe in the
same cDNA library of 8,800 clones (Fig. 4E).
The colony frequencies were 307, 128, 7 and 9, or in ratio, 44, 17, 1 and 1.3, respectively (lane iv in Fig.
4F). [...] agreed, showing that the cDNA library used is a non-biased representation of the
mRNA population. The ratio was practically unchanged when different preparations of mRNA from the same
Fig. 4 shows the proportionality of the composition of the 3'-directed cDNA library and of the mRNA.
In Fig. 4A, 2 µg of poly A+ RNA from HepG2 cells was electrophoresed in lanes 1-4 of a formamide agarose
gel containing ethidium bromide (5 µg/ml) and then exposed to UV. Lane 5 is the RNA ladder (BRL) used
as size markers (kb). In Fig. 4B, the filter was Northern blotted using the following 32P-labeled 3'-specific
cDNA probes: elongation factor-1α (lane 1), α1-antitrypsin (lane 2), HnRNP core protein A1 (lane 3),
inter-α-trypsin inhibitor (lane 4). In Fig. 4C, one pmol each of the non-labeled cDNA fragments [EF-1α (lane 1),
α1-antitrypsin (lane 2), HnRNP core A1 (lane 3), inter-α-trypsin inhibitor (lane 4)] was electrophoresed in a 2%
agarose gel and then photographed. Fig. 4D is a Southern analysis of the blotted filter from Fig. 4C using the
same set of radioactive probes. Lane 5 shows the migration pattern of the reference 1 kb ladder (BRL).
Hard copies of these screen images were taken at 8 h for B, and 1 h for D. The radioactivity in each band
was counted directly in a scintiscanner (β-603; Betagen) and registered in (i) and (ii) in Fig. 4F. The
observed band intensities were corrected based on the band intensities in Fig. 4D (ii in Fig. 4F) and
normalized relative to the value of probe 3 (HnRNP core A1, lane III in Fig. 4F) as 1 (iii in Fig. 4F). These
values represent the relative content of each mRNA species in the original mRNA preparation. Fig. 4E
shows the results of colony hybridization of the membranes carrying 8,800 colonies of the 3'-directed cDNA
library using the same set of the four radioactive probes. Positive colonies were counted and registered (iv
in Fig. 4F), then normalized with the value of HnRNP core protein A1 as 1. The numbers in B, D and E in
Fig. 4 represent the probe No. in Fig. 4F. Fig. 4F shows a remarkable agreement between the values of
lanes (iii) and (v).

[Example 6] 

Population study of the cDNA library 

To analyze further the composition of the cDNA library, 7 and 10 clones were selected from the
redundant (group I) and solitary (group II) sequence groups, respectively, and these inserts were used as
radiolabeled probes for colony hybridization (Fig. 5). The frequencies of the colonies that hybridized with
group I probes were roughly identical to those that were randomly picked and sequenced. These
frequencies were about 3.5%-0.1%. Nearly 52% of the cDNA library population consisted of the redundant
sequence group containing 173 species. When 8 probes from group II were tested, 18 positive colonies
were identified among 26,400 colonies screened, giving an average frequency of 0.007%. Two probes did
not hybridize with any of the 26,400 colonies, resulting in the average frequency of <0.004%. Thus, the
average frequency of the 10 probes in group II was several orders of magnitude less than the lowest of
group I. The results are summarized in Fig. 5, showing the appearance frequencies of various DNA species in
the 3'-directed HepG2 cDNA library. In Fig. 5, seven cDNA probes (a15 through tb042) were selected from
the 162 identified genes in the redundant group (group I), and ten (s155 through s632) were randomly
chosen from the solitary group (group II). In columns A, B and C, each one of the insert DNAs was
radiolabeled and used as a probe for colony hybridization tests of 982 (A), 8,800 (B) or 26,400 colonies (C).
NT indicates "not tested". The DDBJ entry names of the 17 clones listed in this table are HUM000A15,
HUM000C321, HUM00TB038, HUMHM01B02, HUM0C13A04, HUMHM02D02, HUM00TB042,
HUM000S155, HUM000S159, HUM000S639, HUM000S635, HUM000S170, HUM000S154, HUM000S167,
HUM000S645, HUM000S647, and HUM000S632.
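
The appearance frequencies discussed in this example are simple ratios of positive colonies to colonies screened. The sketch below, with a hypothetical function name, reproduces two of the figures from counts given in the text: 307 positive colonies among 8,800 (the most abundant probe in Example 5), and the upper bound for a probe that gives no positive colony among 26,400.

```python
def colony_frequency_percent(positive, screened):
    """Frequency of hybridizing colonies, as a percentage of colonies screened."""
    return 100.0 * positive / screened


# 307 positive colonies among 8,800 screened: about 3.5%.
print(round(colony_frequency_percent(307, 8800), 2))    # 3.49

# A probe with no positive colony among 26,400 must be rarer than 1 in 26,400.
print(round(colony_frequency_percent(1, 26400), 4))     # 0.0038
```

The second value is consistent with the bound of <0.004% quoted above for the two probes that gave no signal.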
[Example 7]

Analyses of sequencing errors

All the sequence data presented in this specification were obtained by repeated cycles of enzymatic
amplification of the plasmid inserts, followed by cycle sequencing with Taq polymerase. Sequences of 60
clones that showed data bank matches were examined for discrepancies from the data bank entries. It was
found that the accuracy in the region 1-100 bp distant from the cloning site was 98.7%, indicating that
primers or probes designed from the sequence in this region are practically free of
erroneous sequences or, even if they contain any errors, are functionally without problems.
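
A position-by-position accuracy over the first 100 bases can be computed as sketched below. This is a simplified illustration: it assumes the read and the data bank entry are already aligned without insertions or deletions, which the text does not specify, and the sequences used are hypothetical.

```python
def accuracy_percent(read, reference, region=100):
    """Percentage of identical positions within the first `region` bases.

    Assumes the two sequences are aligned position by position (no indels).
    """
    n = min(region, len(read), len(reference))
    if n == 0:
        raise ValueError("nothing to compare")
    matches = sum(1 for a, b in zip(read[:n], reference[:n]) if a == b)
    return 100.0 * matches / n


reference = "ACGT" * 25              # hypothetical 100-base data bank entry
read = list(reference)
read[10] = "T"                       # introduce two hypothetical sequencing errors
read[57] = "A"
print(accuracy_percent("".join(read), reference))   # 98.0
```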



[Example 8] 

Mapping of GS by PCR

(cDNA sequence) 



A cDNA library was constructed from mRNA of DMSO-treated HL60 cells. The methods for construction
of the 3'-directed cDNA library and for sequence analysis of the library components are the same as
described in Examples 1-4.

(PCR primer)

Primer design was performed by using the computer software OLIGO 4.0 (National Biosciences), which
eliminates possible formation of inter- or intra-molecular secondary structures. In addition to the primer
design, transfer of oligonucleotide sequences to the local database and synthesizer was semiautomated
using a Macintosh computer linked with a network. DNA oligomers were synthesized on an automated DNA
synthesizer (Model 394, Applied Biosystems) on a 40 nmol scale. The synthesized oligomers were used as
PCR primers without further purification.

(Preparation of Genomic DNA)

The human genomic DNA was extracted from the normal karyotype lymphoblastoid cell line GM0130b.
Mouse and Chinese hamster genomic DNAs were purchased from Clontech. The monochromosomal hybrid
cells utilized for the mapping panel were commonly used ones which have been described previously. Briefly,
chromosomes 3, 4, 9, 11, 12, 13, 15, 22 and Y were carried in human-Chinese hamster monochromosomal
hybrid cells, and chromosomes 1, 2, 5, 6, 7, 8, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21 and X were carried
in the human-mouse monochromosomal hybrid cells A9 series. The integrity of the hybrid cells was
monitored by in situ hybridization.

(Amplification by Polymerase Chain Reaction) 

PCR was performed according to standard protocols (Saiki, R. K., et al., Science 230, 1350-1354,
1985), using 10 pmol of each primer in a total reaction volume of 20 µl, with 35 thermal cycles of 30 sec at
94°C, 60 sec at an annealing temperature, and 90 sec at 72°C, using a Perkin-Elmer 9600 thermal cycler.
The annealing temperature was determined according to the "optimal annealing temperature" estimated by the
program OLIGO.


(Analysis of the PCR Products) 

The PCR products were run on an 8% non-denaturing polyacrylamide gel (acrylamide:bis-acrylamide =
19:1, 1 mm thick) at 300 V for 1 h, followed by staining in 90 mM Tris-borate, 2 mM EDTA buffer solution
containing 0.25 µg/ml ethidium bromide for 15 min. The sizes of the amplification products were determined
relative to the 10 bp DNA ladder (BRL). Detection of fluorescence was performed by using a laser
fluorescent image analyzer (FM-BIO, Hitachi Software Engineering). The image data were transferred to a
computer for analysis.

(Results of Analysis of the PCR Products)

Among various species of 3'-directed cDNA-GSs obtained from granulocytoid cells, 195 novel GSs
which did not match the sequences deposited in GenBank Release 76 were selected and used for
designing primers for the PCR. The PCR was performed with these primers using the total human genomic
DNA as the template.

Among the 195 primer pairs, 191 (98%) yielded products whose size matched those expected within 5
nt. The results are summarized in Figs. 6-10, whose figure legends are as follows: GS, gene signature; CN,
clone name; Chromosomal position, chromosome numbers to which GSs were mapped; Sequence of
primers, DNA sequences of primers (Sense, sense strand; anti-sense, anti-sense strand); AT, annealing
temperature; HO, Observed size of PCR products with total human genomic DNA (nt); HE, Expected size of
PCR products with total human genomic DNA (nt); MO, Observed size of PCR products with mouse
genomic DNA (nt); CO, Observed size of PCR products with Chinese hamster genomic DNA (nt); G,
Number of "hits" of GS in the granulocytoid (DMSO treated HL60) cDNA library after analyzing altogether
1000 clones; T, Total number of "hits" of the GS after analyzing altogether 3000 clones from the three
cDNA libraries of HL60 with and without induction by DMSO or TPA. Question marks ("?") indicate that the
PCR products did not yield a clear band.

"M" indicates that the PCR products yielded a band which was indistinguishable from the band
observed after the reaction using mouse DNA as the template. Similarly, "C" indicates that the PCR
products yielded a band which was indistinguishable from the band after the reaction using Chinese
hamster DNA as the template.
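
The success criterion (observed size within 5 nt of the expected size) and the resulting success rate can be sketched as follows. The function names and the list of observed/expected sizes are hypothetical; only the 5 nt tolerance and the totals of 191 and 195 come from the text.

```python
def size_matches(observed, expected, tolerance=5):
    """True if the observed product size is within `tolerance` nt of the expected size."""
    return abs(observed - expected) <= tolerance


def success_rate_percent(successes, total):
    """Percentage of primer pairs that gave a product of the expected size."""
    return 100.0 * successes / total


# Hypothetical (observed, expected) sizes in nt for four primer pairs.
pairs = [(82, 80), (95, 96), (120, 100), (61, 64)]
successes = sum(size_matches(obs, exp) for obs, exp in pairs)
print(successes, "of", len(pairs))             # 3 of 4

# Figures from the text: 191 of 195 primer pairs succeeded.
print(round(success_rate_percent(191, 195)))   # 98
```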

The overall rate of success of the PCR was 191/195 (98%), although GSs were randomly selected from
the cDNA sequences, indicating that the quality of the cDNA library used in this work was reliable, and that
the sequence analyses and primer designs were performed appropriately. Thus, the chance of
failure of the PCR caused by the presence of an intron(s) in the relevant cDNA sequences is negligible in
working with the GS, as introns virtually do not lie in the poly A proximal 3'-region of vertebrate genes
(Wilcox et al., Nucleic Acids Res. 19, 1837-1843, 1991). This is a big advantage compared to the use of
partial fragmented cDNA sequences obtained from randomly primed cDNA libraries (Adams et al., Science
252, 1651-1658, 1991) or from 5'-directed cDNA libraries.

(Chromosomal assignments of GS)

The 191 primer pairs that yielded PCR products from total human DNA were used for chromosomal
assignments of the GSs with the monochromosomal hybrid cell panel. At least 119 GSs were assigned to a
single chromosome. As an example, GS001418, shown in Fig. 11, was assigned to chromosome number 3.
With some clones, extra products were obtained, some of which were assigned to the same chromosome,
whereas others were assigned to different chromosomes. An example, GS001457, is shown in Fig. 12. Sixty-two (33%)
clones yielded the expected PCR products with two or more different chromosomes. Thirty-five cases
(18%) yielded PCR products whose sizes were indistinguishable from those of background rodent genomic DNA.
Among these, 21 GSs produced products indistinguishable from mouse and Chinese hamster DNA. Ten
GSs yielded no expected PCR products with the monochromosomal cell panel DNA although the expected
PCR products from total human genomic DNA were observed. These 10 cases probably arose from a small
deletion in the hybrid cells. Five clones obtained from the HepG2 cDNA library have also been analyzed by
Southern blot analysis. Four out of the 5 GSs (GS000053, GS000120, GS000271 and GS000279) gave
results consistent with those obtained by the PCR. One GS (GS000228), which was uncertainly assigned to
chromosome Y because of the weak signal detected by the Southern blot method, was assigned to
chromosome 11 by PCR.
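
The assignment logic described above can be illustrated with a short sketch. It assumes, hypothetically, that each hybrid line of the panel is summarized as a list of observed product sizes and that a 5 nt tolerance defines a match; the function names and the example panel are illustrative and are not taken from the original work.

```python
from typing import Dict, List, Sequence


def chromosomes_with_product(expected_nt: int,
                             panel: Dict[str, Sequence[int]],
                             tol: int = 5) -> List[str]:
    """Chromosomes whose hybrid line gave a product within tol nt of the expected size."""
    return [chrom for chrom, sizes in panel.items()
            if any(abs(size - expected_nt) <= tol for size in sizes)]


def classify_gs(expected_nt: int,
                panel: Dict[str, Sequence[int]],
                rodent_sizes: Sequence[int] = (),
                tol: int = 5) -> str:
    """Sort one GS into the outcome classes described in the text."""
    # A product of the same size from rodent DNA alone makes the panel uninformative.
    if any(abs(size - expected_nt) <= tol for size in rodent_sizes):
        return "indistinguishable from rodent background"
    hits = chromosomes_with_product(expected_nt, panel, tol)
    if not hits:
        return "no expected product in panel"
    if len(hits) == 1:
        return "chromosome " + hits[0]
    return "two or more chromosomes: " + ", ".join(hits)


# Hypothetical panel: chromosome label -> observed product sizes (nt).
panel = {"3": [250], "7": [], "X": [400]}
print(classify_gs(252, panel))
```

A GS that falls in the "no expected product in panel" class while still amplifying from total human DNA corresponds to the ten cases attributed above to small deletions in the hybrid cells.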

[Example 9] 

Mapping of GS by Southern blot method 
(Cell lines) 

Total human genomic DNA was isolated from the human normal karyotype lymphoblastoid cell line
GM0130b. Monochromosomal hybrid cells used as the mapping panel are shown in Fig. 15. Hybrid
A9(neo-x)-y cells as described by Koi et al. (Jpn. J. Cancer Res. 80, 413-418, 1989) were donated by Dr. M.
Oshimura, Faculty of Medicine, Tottori University, passaged 3 times and frozen for storage. The loss or
rearrangement of chromosomes could have occurred during this period. The GM series was obtained from
the Mutant Cell Repository, National Institute of General Medical Sciences (NIGMS) (Camden, NJ). To
confirm that human chromosomes remained intact in the hybrid cells after storage in liquid nitrogen,
metaphase spreads of the hybrid cells were monitored by chromosome staining based on in situ
hybridization using biotinylated total human DNA as the probe (Durnam, D. M., et al., Somatic Cell Mol.
Genet. 11, 571-577, 1985). Intact, as well as translocated or fragmented, human chromosomes were easily
detected by this means. In one hybrid cell mapping panel, chromosomes 11, 12 and 15 were represented by
the hybrid cell lines A9(neo-11)-1, A9(neo-12H and A9(neo-15)-2, respectively, and in another panel, they
were represented by the hybrid cell lines GM10927A, GM10868 and GM11418, respectively.

(Isolation of genomic DNA and Southern blotting) 

High molecular weight DNA was extracted from cells using sodium dodecyl sulfate (SDS) and
Proteinase K, followed by phenol-chloroform extraction and ethanol precipitation. DNAs were digested
overnight with a combination of two restriction enzymes, including EcoRI, BamHI and BglII. About 5 µg of
each digest was electrophoresed in a 0.8% agarose gel, then transferred to Hybond N+ membrane
(Amersham) with 0.4 N NaOH. The membrane was rinsed in 2xSSC and stored at 4°C for subsequent
use.

Clones containing a novel sequence and having more than 150 bp were selected as probes. The cDNA
inserts of the clones were amplified by the PCR. The PCR products were isolated by electrophoresis
through a 2% low-melting temperature agarose gel (NuSieve : SeaPlaque, 3:1), followed by excision. The
gel was removed by melting at 65°C and digesting with β-Agarase I (Bio Labs) at 40°C for 1 h. The
probes were labeled with [α-32P]dCTP by random priming using a commercial kit (Amersham). Hybridization
proceeded at 65°C in a high salt buffer containing 6xSSC, 1x Denhardt's solution and 0.5% SDS, in the
presence of 0.1 mg/ml of sonicated, denatured salmon sperm DNA. The membranes were washed in
2xSSC, 0.1% SDS at 65°C for 30 min, then twice for 30 min in 0.1xSSC, 0.1% SDS at 65°C, and analyzed
using a Fuji BAS-2000 imaging analyzer.

(Analyses with Genomic DNA)

Among the HepG2 3'-directed cDNA libraries described in Examples 1 and 2, 160 novel clones were
selected and used as probes for Southern blots.

Total human genomic DNA was isolated from the cell line GM0130b, which has a normal karyotype, and
digested with the restriction enzymes EcoRI, BamHI and BglII alone or in combination. The GS clones used
as probes were the 3'-directed cDNAs. Each of these cDNAs covers the region between the poly(A) site and
the nearest MboI site (GATC) (Okubo, K., et al., Nature Genetics 2, 173-179, 1992) and thus does not have
restriction sites for BamHI or BglII. In addition, because the average size of a GS is 270 bp, the chances of
having an EcoRI site in the cDNA moiety were not high; in fact, only 7 clones out of the 160 analyzed had
an EcoRI restriction site.
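
The reasoning above rests on the fact that the BamHI site (GGATCC) and the BglII site (AGATCT) each contain the MboI site GATC, so a fragment with no internal GATC cannot carry either of them. A minimal sketch of such a check follows; the helper names are illustrative, the example uses only the first 20 nt of SEQ ID NO: 5251, and ambiguous bases such as N are simply treated as non-matching, which is an assumption of this sketch.

```python
from typing import Dict, List

# Recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {
    "MboI": "GATC",
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "EcoRI": "GAATTC",
}


def find_sites(seq: str, site: str) -> List[int]:
    """0-based start positions of every (possibly overlapping) occurrence of site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def site_report(seq: str) -> Dict[str, List[int]]:
    """Positions of each enzyme's recognition sequence in seq."""
    return {name: find_sites(seq, site) for name, site in RECOGNITION_SITES.items()}


# First 20 nt of SEQ ID NO: 5251; the GS starts at the MboI site itself.
gs_start = "GATCAAAAATAAGATTACAG"
print(site_report(gs_start))
```

Because the GS begins exactly at the GATC, the base that would precede it in a full BamHI or BglII site lies outside the cloned fragment, so the only site expected in such a report is the MboI site at position 0.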

Membranes blotted with digested human genomic DNA were hybridized with radio-labeled GS probes
and washed at high stringency. Since the 3'-terminal region of cDNA has, in general, a unique sequence
which differs from that of protein encoding regions, which tend to have conserved motifs, cross hybridization
with unrelated cDNA sequences will not occur under such stringency. Examples of the results of
hybridization are shown in Figs. 13 and 14. Clones s503 and s632 (Figs. 13a and 13b; junk) respectively
represent unique single band producers. As shown below, 67 clones belonged to this class. The positions of
the GS sequence relative to the restriction sites were inferred from the band patterns. Clone s311 (Fig. 13c;
GS000092) showed a single band with EcoRI- as well as (EcoRI + BamHI)-digested DNA, but two bands of
different sizes in other double digests. The double digestion thus helped resolve multiple GSs. Similar
results were obtained with clone c13a08 (Fig. 13d; GS000055), in which there were 2 bands with EcoRI- or
(EcoRI + BamHI)-digested DNAs, and 4 when digested with (EcoRI + BglII) or (BamHI + BglII). On the other
hand, 4 hybridization bands appeared with clone s479 with EcoRI alone, but the number of bands
decreased with (EcoRI + BglII) and (BamHI + BglII) (Fig. 14e; GS000180). These results indicate that
genomic DNAs should be digested in various ways to reveal the maximum number of hybridizing
fragments. The results of the analysis showed that 41, 10, 7 and 19 clones contained 2, 3, 4 and 5 or more
bands, respectively. Clones s14f01 and tw1-46 (Figs. 14f and 14g; GS000407 and junk, respectively)
contained at least 10 bands in each lane. Since the EcoRI restriction site is not present in the two GS
sequences, the multiplicity of bands is likely to represent the multiple copy number of these genes. Clone
kmb07 appeared as a smear (Fig. 14h; junk), even after intensive high stringency washes, suggesting that this
probe has a repetitive sequence which has not hitherto been identified.

(Chromosomal assignments) 

A set of monochromosomal hybrid cells carrying a single human chromosome in a background of
rodent chromosomes was collected (Fig. 15). Thirteen cell lines were microcell hybrids established by Koi et
al. (Koi, M., et al., Jpn. J. Cancer Res. 80, 413-418, 1989) and the others were obtained from NIGMS. The
results of monitoring the human chromosomes in these cell lines by in situ hybridization using biotinylated
total human DNA are also presented in Fig. 15.

The GSs were assigned to chromosomes using hybrid cell mapping panels. Three types of membranes
were prepared, each carrying DNAs prepared from hybrid cells and digested with EcoRI, (EcoRI + BamHI), or
(BamHI + BglII). Among these three types of membranes, the one expected to yield the maximum
number of bands was used for each GS probe, according to the results of total genomic Southern blots.
Examples of hybridization results are shown in Figs. 16-22. The numeral on each lane represents the
human chromosome number contained in the hybrid cell, and H stands for the total human
chromosomes. Clone s14g02 (GS000152; Fig. 16), which showed a single hybridization band with the total
human DNA digested with EcoRI (lane H), showed the corresponding band only with the hybrid cell line
containing human chromosome 4. Thus, this GS lies in chromosome 4.

The clone s650 (GS000041; Fig. 17) was assigned to chromosome 12, which showed a characteristic
7.5kb band on an (EcoRI + BamHI)-digested membrane. However, with EcoRI-digested
DNA, the clone could not be assigned, as the human-specific and the cross-reacting rodent DNA fragments
overlapped. The single, but shorter fragment band (1.3kb) which appeared in lanes 3, 4, 9, 13 and 22
represents the homologous DNA sequence in Chinese hamster, and the 3.3kb band in other lanes
represents the homologous DNA in the mouse.

Clone hm01e01 (GS000181; Fig. 18) exhibited two fragments when hybridized to total human DNA
treated with EcoRI alone, and the corresponding bands appeared in lanes 1 and 2. Thus, the two
members of this gene family are located on two chromosomes.

Fig. 19 shows that clone c13a08 (GS000055) exhibited 4 bands when hybridized to (BamHI + BglII)- or
[...] bands in the total DNA both consist of two overlapping
fragments. As shown in lane H, the intensity of these overlapping fragments was higher than normal. The
[...] in lane 11 was also intense, suggesting that it also represents overlapping
[...]

Clone [...] (GS000094) exhibited 5 bands in EcoRI-cleaved total DNA (Fig. 21). Four corresponding
[...] 4.5kb band was observed in lane 4, indicating that
[...]. In addition, an intense 3.1kb band was observed in lane 17.

Clone [...] (junk; Fig. 22) exhibited many bands with total DNA, and with those from monochromosomal
hybrid cells. This clone must represent a multiple and closely related family of genes. It also
[...] rodent genes which also give rise to [...] bands. Since
most of the human specific and rodent bands overlapped, the chromosomes could not be assigned. Other
combinations of [...] assignments of 160 GSs are
summarized in Fig. 23. [...] genomic DNA analyses using 4 different digested human
DNAs, 67 were categorized in a single band group, 41 in a two band group, 10 in a three band
group, 7 in a four band group and 19 in a group that yielded five or more bands. Nine clones did not show
[...] chromosome. The [...]
[...] the gene represented by clone s317 originated from the same chromosome. The three band
[...] (GS000412) and s401 (GS000224) showed that two of the fragments [...] on the same
[...] and s17a10 (GS000294) showed bands in different chromosomes.
Clones displaying four or more bands showed a relatively dispersed distribution among
chromosomes. "junk" in Example 9 is a DNA segment cloned by the same method used for GS but is not
numbered.

[Example 10 Cloning of gene using GS] 

[10A. Cloning of a full length cDNA encoding a human ribosomal protein, homologue of yeast S28. Cloning
of the full length cDNA by PCR using a primer comprising a partial sequence of a GS (1)]

Using a primer (5'-TGAAAATTTATTACTACAGTGTTTTCACCA-3' (SEQ ID NO: 7839)) that is a partial
sequence of a GS and is substantially the same as the complementary strand of HUMGS00500 and a
primer [...] complementary to the vector (pSPORT)
sequence [...] of the cDNA, a HepG2 cDNA library was amplified by the PCR
and a full length cDNA clone encoding a human ribosomal protein, a homologue of yeast ribosomal protein
S28, was isolated (Hori et al., Nucl. Acids Res. 21: 4394, 1993).

[10B. A human ribosomal protein homologous to rat L9 ribosomal protein. Cloning of the full length cDNA by
PCR using a primer comprising a partial sequence of a GS (2)]

Using a primer 5'-CTTCTTTCTGTAGCCAGGTAACTCT-3' (SEQ ID NO: 7841) that is a partial sequence
of a GS and is substantially the same as the complementary strand of HUMGS00418 and a primer (SEQ
ID NO: [...]) complementary to the vector (pSPORT) sequence that is located external to the [...]
cDNA, a full length cDNA clone encoding a human ribosomal protein homologous to rat L9 was isolated
(Hori et al., Nucl. Acids Res. 21: 4395, 1993).

[10C. A human protein homologous to bovine phosphatidylethanolamine-binding protein. Cloning of the full
length cDNA by hybridization using a probe comprising a partial sequence of a GS]

By hybridization with the probe,

5'-GATCGTTCTTCATGGGGGTAAGAAAAGCTGCTCTGGAGTTGCTGAATG
TTGCATTAATTGTCCTGTTTC
GAAGGAAA-3' (SEQ ID NO:7838),

that comprises a partial sequence of HUMGS00421, a full length cDNA clone encoding a human protein
homologous to bovine phosphatidylethanolamine-binding protein was isolated (Hori et al., Gene 140:293,
1994).

[10D. Human mpl-ligand. Cloning of a cDNA coding for the human mpl-ligand using a GS]

This embodiment employs the 5' SLIC (single ligation to single stranded cDNA) method, which is an
improved version of the 5' RACE (rapid amplification of cDNA ends) method and is described in Nucleic
Acids Res., 19, 5227-5232 (1991).

① Reverse transcription of cDNA and attachment of anchor

The template was prepared using the reagents of the 5'-AmpliFINDER™ Kit (Toyobo, Inc.) in accordance
with the protocol included therewith. Specifically, 2 µg of human fetal liver poly A+ RNA (Clontech Laboratories,
Inc.) and 10 pmol of the primer PA-6, a primer corresponding to the 3' end of the gene signature (GS)
sequence HUMGS02342 and consisting of the sequence 5'-TTTTCGGCGCTCCCATTTATTCCTT-3' (SEQ
ID NO: 7842), were mixed together and then denatured by heating the mixture at 65°C for 5 min. The
cDNA was synthesized by combining the denatured sample with AMV reverse transcriptase, RNase
inhibitor, dNTPs, and a reaction buffer, and then heating the resultant mixture at 52°C for 30 min. EDTA
was then added to the mixture to stop the reaction. Thereafter, the RNA was hydrolyzed by adding NaOH to
the reaction mixture and heating the resultant mixture at 65°C for 30 min. The mixture was then neutralized
with acetic acid. A suspension of glass beads (GENO-BIND™) and NaI were added to the neutralized
solution and the cDNA was adsorbed onto the beads. The cDNA, adsorbed onto the beads, was washed
with an aqueous solution of 80% EtOH, and then eluted in 50 µl of distilled water. Glycogen was added to
the solution of purified cDNA, and the cDNA was precipitated with EtOH and resuspended in 6 µl of distilled
water. The resultant suspension (2.5 µl) was added to a solution containing 4 pmol of AmpliFINDER Anchor
(5'-CACGAATTCACTATCGATTCTGGAACCTTCAGAGG-NH2-3') (SEQ ID NO: 7843) provided with the Kit,
T4 RNA ligase, and a ligation (reaction) buffer. The reaction mixture was incubated at room temperature
overnight, and the AmpliFINDER Anchor primer in the reaction mixture was thereby ligated to the 3' end of
the cDNA. The ligated product was then used as a template for the subsequent PCR.

② Amplification by PCR

The primary PCR was carried out using the template produced in the procedure described above (①),
the Anchor primer, 5'-CTGGTTCGGCCCACCTCTGAAGGTTCCAGAATCGATAG-3' (SEQ ID NO: 7846), and
the PA-5 primer consisting of the sequence 5'-CTCGCTCGCCCATCCTTATACAGGCTCAGTTTTGTCT-3'
(SEQ ID NO: 7844). Specifically, 1 µl of the template was mixed with Taq DNA polymerase (Takara Shuzo
Inc., Code No. R001A), dNTPs, a PCR buffer, and 10 pmol each of the PA-5 primer and Anchor primer. The
resultant reaction mixture was diluted with distilled water to a final volume of 50 µl and the PCR was
performed in a DNA Thermal Cycler 480 (Perkin Elmer Cetus Corp.). The reaction mixture was subjected to
40 cycles of the PCR, wherein each cycle consisted of incubating the sample in sequence at 94°C for 1
min, 63°C for 1 min, and 72°C for 3 min and, in the last PCR cycle, at 72°C for an additional 8 min. The
products of the PCR were resolved by electrophoresis in a 1% agarose gel and a broad band of
approximately 800 bp in length, representing a product of the PCR, was detected. The detected band was
excised from the agarose gel and the DNA contained therein was recovered using a Sephaglas BandPrep
Kit™ (Pharmacia Corp.) in accordance with the protocol included therewith. Specifically, the gel was
dissolved in a solution of NaI and the resultant mixture was heated at 60°C for 10 min. Sephaglas BP
was added to the gel mixture and the DNA was adsorbed onto the glass beads contained therein. The glass
beads containing the adsorbed DNA were then washed three times with a Wash Buffer provided with the
Kit and eluted in 30 µl of TE buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA).
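
The relation between the Anchor ligated to the cDNA (SEQ ID NO: 7843) and the Anchor primer used for amplification (SEQ ID NO: 7846) can be checked with a short sketch. The helper names are illustrative only; the two sequences are copied from the text, and the check merely asks how much of the 3' end of the Anchor primer is the reverse complement of the Anchor.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def suffix_prefix_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


# Sequences as printed in the text (5' to 3').
ANCHOR = "CACGAATTCACTATCGATTCTGGAACCTTCAGAGG"            # SEQ ID NO: 7843
ANCHOR_PRIMER = "CTGGTTCGGCCCACCTCTGAAGGTTCCAGAATCGATAG"  # SEQ ID NO: 7846

# The 3' portion of the Anchor primer should pair with the 3' portion of the Anchor.
overlap = suffix_prefix_overlap(ANCHOR_PRIMER, reverse_complement(ANCHOR))
print("complementary 3'-terminal bases:", overlap)
print("non-complementary 5' tail:", ANCHOR_PRIMER[:len(ANCHOR_PRIMER) - overlap])
```

The complementary 3' stretch is what allows the Anchor primer to anneal to the ligated Anchor, while the remaining 5' bases of the primer form a tail that does not pair with it.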

One µl of the eluted DNA was used as a template in a secondary PCR. In order to enhance the
specificity of the secondary PCR, the reaction was performed with the PA-4 primer, which consisted of the
sequence 5'-CTCGCTCGCCCATGTATAGGGACAGCATTTCTGAGAG-3' (SEQ ID NO: 7845) and was
positioned within the template sequence internal to the PA-5 primer and the Anchor primer. Specifically, 1 µl of
the template was mixed with 2.5 units of Taq DNA polymerase (Takara Shuzo Inc., Code No. R001A),
dNTPs, a PCR buffer, and 10 pmol each of the PA-4 primer and Anchor primer. The resultant reaction
mixture was diluted with distilled water to a final volume of 50 µl, preheated at 94°C for 6 min, and the
secondary PCR was then performed under the same conditions described above (②) for the primary PCR.
The products of the secondary PCR were resolved by electrophoresis in a 1% agarose gel and a broad
band of approximately 800 bp in length, representing a product of the PCR, was detected. The detected
band was excised from the agarose gel and the DNA contained therein was recovered and purified under
the same conditions as described above (②) for the primary PCR.

③ Subcloning into plasmid vector

The purified DNA product of the secondary PCR was subcloned into the plasmid vector pUC18
(Pharmacia Corp.), using a SureClone™ Ligation Kit (Pharmacia Corp.) in accordance with the protocol
included therewith. Specifically, the purified DNA was added to a solution containing Klenow polymerase,
polynucleotide kinase and a reaction buffer, mixed and heated at 37°C for 30 min in order to create
blunt-ended termini and to phosphorylate the 5' terminus of the DNA molecules contained in the reaction mixture.
The blunt-ended and phosphorylated DNA was combined with a solution containing 50 ng of a
dephosphorylated and SmaI-cleaved pUC18 vector provided with the Ligation Kit, T4 DNA ligase, DTT and a
ligation reaction buffer, and the resultant mixture was incubated at 16°C for 3 hr. One sixth volume of the
reaction solution was employed to transform E. coli competent cells using standard methods. Specifically,
frozen E. coli competent cells (Wako Pure Chemical Industries, Ltd.) were thawed and mixed with the
ligated DNA. The resultant mixture was incubated on ice for 20 min, heat-treated at 42°C for 45 sec, and
then incubated on ice for 2 min. A medium [Hi-Competence Broth (Wako Pure Chemical Industries, Ltd.)]
was added to the mixture containing the transformed E. coli cells. The mixture was incubated at 37°C for 1
hr and then spread onto agar plates containing 100 µg/ml Ampicillin, 40 µg/ml X-Gal (5-bromo-4-chloro-3-
indolyl-β-D-galactoside) and 0.1 mM IPTG (isopropyl-β-D-thiogalactopyranoside), and cultured overnight at
37°C. White colonies were selected from the colonies which consequently appeared on the agar plates and
analyzed by the PCR to determine the presence or absence of the DNA insert. Specifically, a sample of a
selected colony was picked with a sterilized toothpick and used to inoculate a 50 µl reaction solution
containing 1 unit of Taq DNA polymerase, dNTPs, PCR buffer, and 200 µM each of the M13 P4-22 primer
consisting of the sequence 5'-CCAGGGTTTTCCCAGTCACGAC-3' (SEQ ID NO: 7847) and the M13 P5-22
primer consisting of the sequence 5'-TCACACAGGAAACAGCTATGAC-3' (SEQ ID NO: 7848), wherein both
primers are comprised of sequences complementary to the pUC18 vector. The resultant mixture was
heated at 94°C for 6 min and then subjected to 30 cycles of the PCR, wherein each cycle consisted of
incubating the sample in sequence at 94°C for 1 min, 55°C for 1 min, and 72°C for 2 min. The amplified
insert was detected by electrophoresis of the PCR products on an agarose gel and thereby the clone
pR02342-2, containing an insert, was selected.

④ Sequencing of cDNA

The plasmid DNA was prepared using the QIAPrep-Spin Kit (Funakoshi, Ltd.) in accordance with the
standard alkali-SDS protocol included therewith. Specifically, E. coli cells transformed with the DNA of clone
pR02342-2 were cultured overnight in Luria Broth medium containing 100 µg/ml Ampicillin. The cultured
cells were then pelleted by centrifugation and resuspended in P1 solution provided in the Kit. The resultant
cell suspension was mixed with the P2 alkali solution of the Kit, incubated at room temperature for 5 min,
neutralized with N3 solution of the Kit, incubated on ice for an additional 5 min and then centrifuged. The
supernatant obtained from the centrifuged solution was applied to a QIAPrep-Spin column. The Spin column
was then washed in sequence with PB and then PE solution of the Kit and the DNA was eluted from the
column with TE buffer. Sequencing of the eluted DNA was then carried out using the sequencing kit
PRISM™ Terminator Mix (Applied Biosystem Corp.). Approximately 1 µg of the purified DNA was mixed
with a solution containing 3.3 pmol of either the M13 P4-22 primer or M13 P5-22 primer and 9.5 µl of
PRISM™ Terminator Mix. The M13 P4-22 and M13 P5-22 primers were used to sequence both strands of
the DNA insert of clone pR02342-2. The resultant mixture was diluted to a final volume of 20 µl with
distilled water and subjected to 25 cycles of the PCR, wherein each cycle consisted of incubating the
sample in sequence at 96°C for 30 sec, 50°C for 15 sec, and 60°C for 4 min. The excess primers and
fluorescent dye present in the reaction mixture were removed by gel filtration using a MicroSpin™ S-200
HR column (Pharmacia Corp.) and the DNA products of the sequencing reaction were precipitated with
EtOH. The precipitated DNA was resuspended, sequenced using an automated sequencer, "Model 373A"
(Applied Biosystem Corp.), and thereafter analyzed to determine the nucleotide sequence.

The analysis of the nucleotide sequence revealed that the insert of clone pR02342-2, including the PA-4
primer, was 608 bp in length. The sequence of this insert was subjected to a search for homologous
sequences entered in the GenBank data base, and a 100% match was found to a sequence in the cDNA
which encodes the human mpl-ligand (Accession No. L33410, Nature 369, 533-538, 1994). Further
comparison of the insert of clone pR02342-2 with the cDNA sequence of the human mpl-ligand revealed
that the cloned insert contained 81 bp of the 3' coding region of the open reading frame. In addition, the insert
of clone pR02342-2 contained an additional sequence extending beyond the 3' end of the human
mpl-ligand cDNA sequence registered under GenBank Accession No. L33410. These findings suggest that,
using the GS HUMGS02342, the inventors of the present invention succeeded in cloning a cDNA clone,
pR02342-2, which could possibly have a different and more desirable property for expression than the
human mpl-ligand cDNA represented by the sequence registered under GenBank Accession No. L33410.

⑤ Cloning of the full-length cDNA encoding the human mpl-ligand

In order to find an optimal PCR primer, an appropriate computer program is used to search the
sequence downstream of the coding region of the human mpl-ligand (clone pR02342-2) and thereby a
primer PA-7 is designed and synthesized. A PCR similar to that described above in ② is performed using
the template produced by the procedure described above in ①, the Anchor primer, and the PA-7 primer.
Specifically, 1 µl of the template is mixed with 2.5 units of Taq DNA polymerase (Takara Shuzo Inc., Code
No. R001A), dNTPs, a PCR buffer, and 10 pmol each of the PA-7 primer and Anchor primer. The resultant
reaction mixture is diluted with distilled water to a final volume of 50 µl and the PCR is performed in a DNA
Thermal Cycler 480 (Perkin Elmer Cetus Corp.) under conditions similar to those described above in ②. The
products of the PCR are then resolved by electrophoresis on a 1% agarose gel and a band greater than
1300 bp in length, representing a product of the PCR, is recovered and cloned into a suitable vector in a
manner similar to that described in ③. The cloned DNA is sequenced in a manner similar to that described
in ④. The sequence is then compared to that of the human mpl-ligand cDNA registered under GenBank
Accession No. L33410 to confirm the presence of the full-length open reading frame.

Alternatively, using the Takara LA PCR Kit (Takara Shuzo Inc., Code No. RR011) in accordance with the
protocol included therewith, and performing the 5' RACE procedure using primers similar to those described
above in ⑤, a cDNA of approximately 2 kb in length, corresponding to the human mpl-ligand, was isolated.

The tables of appearance frequencies for all GSs related to the present invention are followed by the
"Sequence Listing" for these GSs, wherein HUMGS numbers after the heading 'clone' represent GS
numbers. In the sequence table, N in the base sequence stands for "A or C or G or T or U". However,
since nucleic acids in the Sequence Listing are DNAs, "T or U" stands for T in this case.
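
When a listed sequence containing N is compared with a fully determined sequence, the ambiguity has to be expanded. The following is a minimal sketch of one way to do this with a regular expression. Only N is defined in the text above; the code K (G or T), which appears in one of the listed sequences, is taken from the standard IUPAC nucleotide codes, and the function names are illustrative.

```python
import re

# Base code -> regular-expression fragment. U is mapped to T because the listing is DNA.
# Only N is defined in the text; K = G or T follows the IUPAC nucleotide codes.
CODE_TO_PATTERN = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "N": "[ACGT]",
    "K": "[GT]",
}


def to_regex(listed_seq: str) -> "re.Pattern[str]":
    """Compile a listed sequence into a pattern matching every sequence it may stand for.

    Codes not present in CODE_TO_PATTERN raise KeyError; extend the table if needed.
    """
    return re.compile("".join(CODE_TO_PATTERN[base] for base in listed_seq.upper()))


def count_ambiguous(listed_seq: str) -> int:
    """Number of positions that are not a plain A, C, G or T."""
    return sum(1 for base in listed_seq.upper() if base not in "ACGT")


# A 10-base group taken from the sequence listing, compared with a hypothetical read.
pattern = to_regex("ATNTNATCTG")
print(bool(pattern.fullmatch("ATGTCATCTG")), count_ambiguous("ATNTNATCTG"))
```

Counting the ambiguous positions in this way also gives a quick measure of how much of a GS is reliably determined before it is used to design a probe or primer.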

By the present invention, it has become possible to provide DNA molecules which carry "the
information for expression" in various cells and can be used for detecting and diagnosing cellular
abnormalities, recognizing and identifying cells, and efficiently cloning genes which are expressed in
a tissue-specific manner, and furthermore cloned DNA molecules which can be used for the production of
proteins useful as pharmaceutical products.
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SEQUENCE LISTING 
(1) GENERAL INFORMATION: 

(i) APPLICANT:

(A) NAME: CHUGAI PHARMACEUTICAL CO., LTD.

(B) STREET: 41-8, Takada 3-chome, Toshima-ku

(C) CITY: Tokyo 

(E) COUNTRY: JAPAN 

(F) ZIP: 171 

(ii) TITLE OF INVENTION: GENE SIGNATURE 

(iii) NUMBER OF SEQUENCES: 7848 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.5 in., DS, 1.44 MB

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: MS-DOS

(v) CURRENT APPLICATION DATA 

(A) APPLICATION NUMBER: EP 95900295.7 

(vi) PRIOR APPLICATION DATA 

(A) APPLICATION NUMBER: PCT/JP94/01916 

(B) FILING DATE: 11. November 1994 
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GATCATCACT AGCAGATGTC AGTTGCACAT TGAGTCCTTT ATGAAATTCA TAAATAAAGA 60 
ATTGTTCTTT CTTTGTGGTT TTAATAAGAG TTCAAGAATT GTTCAGAGTC TTGTAAATGT 120 
TATTTTAATA ATCCCTTTAA ATNTNATCTG TTGCTGTTAC CTCTTGAAAT ATGATTTATT 180
TAGATTGCTA ATCCCACTCA TTCAGGAAAT GCCAGGKAGG TATTCCTTGG GGAAATGGTG 240 
CCTCTTACAG TGTAAATNTT NCCTCCTGNA CCTTTGCTAA TATCATGGCA GANTTNNCCT 300 
NATCCCTTTG TGAGGCAGTT TN 322 

SEQ ID NO: 5251
SEQUENCE LENGTH: 215
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear
CLONE: HUMGS06269
SEQUENCE DESCRIPTION:

GATCAAAAAT AAGATTACAG TTAAAATATT NCTATATTCA GATGGTTTAG AGACCAGGCT 60 
GTAGAATCAG ACAGCCCTGA AATTGTATCA CACANGGCTG TGTTACCCTG TACAAATAAC 120 
TTAGCCTTAC TAAGCCTGTA TTTCCTCATC TGCAAANTAG GGNTGNTAAT ATACCTGTNG 180 
NTAAANATGT TTTCATTAAA AANCGTTGGC GCAAA 215 

SEQ ID NO: 5252 
SEQUENCE LENGTH: 229 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
CLONE: HUMGS06270
SEQUENCE DESCRIPTION: 

GATCTAAAGT GCAGCAGAGT GGCTGNTGCT GCAAGTNATO TCTAAGGCTA GGAACTATCA 60 
GGTGTCTATA ATTGTAGCAC ATGGAGAAAG CAANTGTAAA ACTGGATAAG AAAATTATTT 120 
TGGCAGTTCA GCCCNTTCCC TTTTTCCCAC TAANTTTTTN CTTAAATTAC CCATGTAACC 180 
ATTTNANCTC TCCAGTGCAC TTTGCCATTA ANGTCTCTGC ACATTGAAA 229 

SEQ ID NO: 5253 
SEQUENCE LENGTH: 219 
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear
CLONE: HUMGS06271
SEQUENCE DESCRIPTION: 

CATCGTGAAA GAGGCTTACC CAGACCACAC ACANGTTTGA GAAAAACAAT CCCCATTATN 60 
ACCCATCTAG CAAAGAGGAC AACCCTAAGT GGTCCATGNG TGGATGTACA GTTTGTNCGG 120 
ATGATGAAGC GTTTCATTCC CCTGGCTGAG CTCAAATCCT GTCATCAAGG CNCACAANGC 180 
TACTGNTGGC CCCTNAAAAA ATATTGTTNT NTGTCACTN 219 

SEQ ID NO: 5254 
SEQUENCE LENGTH: 144 
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear
CLONE: HUMGS06272

A DNA probe consisting of a purified single-stranded DNA, a purified single-stranded DNA complementary
thereto, or a purified double-stranded DNA consisting of said single strands, containing all
or a portion of a single-stranded DNA or a single-stranded DNA complementary thereto comprising any
of the base sequences listed under SEQ ID NO 1-7837 and hybridizing specifically to a particular site
of human genomic DNA, human cDNA or human mRNA.

A DNA primer consisting of a purified single-stranded DNA, a purified single-stranded DNA complementary
thereto, or a purified double-stranded DNA consisting of said single strands, containing all
or a portion of a single-stranded DNA or a single-stranded DNA complementary thereto comprising any
of the base sequences listed under SEQ ID NO 1-7837 and hybridizing specifically to a particular site
of human genomic DNA, human cDNA or human mRNA.

A purified single-stranded DNA, a purified single-stranded DNA complementary thereto or a purified
double-stranded DNA consisting of said single strands, containing all or a portion of a single-stranded
DNA or a single-stranded DNA complementary thereto, wherein said single-stranded DNA is
complementary to a human mRNA containing any of the base sequences listed under SEQ ID NO 1-7837
(wherein T is read as U) or any portion thereof at its 3' region and hybridizing specifically to a
particular site of human genomic DNA, human cDNA or human mRNA.

A DNA probe consisting of a purified single-stranded DNA, a purified single-stranded DNA complementary
thereto, or a purified double-stranded DNA consisting of said single strands, containing all or a
portion of a single-stranded DNA or a single-stranded DNA complementary thereto, wherein said
single-stranded DNA is complementary to a human mRNA containing any of the base sequences listed under
SEQ ID NO 1-7837 (wherein T is read as U) or any portion thereof at its 3' region, and hybridizing
specifically to a particular site of human genomic DNA, human cDNA or human mRNA.

A DNA primer consisting of a purified single-stranded DNA, a purified single-stranded DNA complementary
thereto, or a purified double-stranded DNA consisting of said single strands, containing all
or a portion of a single-stranded DNA or a single-stranded DNA complementary thereto, wherein said
single-stranded DNA is complementary to a human mRNA containing any of the base sequences listed
under SEQ ID NO 1-7837 (wherein T is read as U) or any portion thereof at its 3' region, and
hybridizing specifically to a particular site of human genomic DNA, human cDNA or human mRNA.

SEQ ID NO: 7844
SEQUENCE LENGTH: 37 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
SEQUENCE DESCRIPTION: 

CTCGCTCGCC CATCCTTATA CAGGCTCAGT TTTGTCT 37 
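
Each row of the Sequence Listing ends with a running base count, and each entry declares a SEQUENCE LENGTH, so an entry can be checked for internal consistency. The following is a minimal sketch of such a check; the function names are illustrative, and the row format (groups of bases followed by a cumulative count) is inferred from the entries shown here.

```python
from typing import List


def parse_sequence_rows(rows: List[str]) -> str:
    """Join the base groups of each row, checking the running count at the end of the row."""
    sequence = ""
    for row in rows:
        *groups, count = row.split()
        sequence += "".join(groups)
        if len(sequence) != int(count):
            raise ValueError(
                f"running count {count} does not match {len(sequence)} bases read so far"
            )
    return sequence


def check_entry(declared_length: int, rows: List[str]) -> str:
    """Return the sequence if it agrees with the declared SEQUENCE LENGTH."""
    sequence = parse_sequence_rows(rows)
    if len(sequence) != declared_length:
        raise ValueError(f"declared length {declared_length}, found {len(sequence)}")
    return sequence


# SEQ ID NO: 7844 as printed above (the PA-5 primer).
pa5 = check_entry(37, ["CTCGCTCGCC CATCCTTATA CAGGCTCAGT TTTGTCT 37"])
print(pa5)
```

A check of this kind is useful on a scanned listing, where a misread or dropped character changes the base count of the affected row.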



SEQ ID NO: 7845
SEQUENCE LENGTH: 37
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
SEQUENCE DESCRIPTION:

CTCGCTCGCC CATGTATAGG GACAGCATTT CTGAGAG 37

SEQ ID NO: 7846
SEQUENCE LENGTH: 38
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
SEQUENCE DESCRIPTION:

CTGGTTCGGC CCACCTCTGA AGGTTCCAGA ATCGATAG 38



SEQ ID NO: 7847
SEQUENCE LENGTH: 22
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
SEQUENCE DESCRIPTION:

CCAGGGTTTT CCCAGTCACG AC 22

SEQ ID NO: 7848
SEQUENCE LENGTH: 22
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
SEQUENCE DESCRIPTION:

TCACACAGGA AACAGCTATG AC 22

A purified single-stranded DNA, a purified single-stranded DNA complementary thereto, or a purified
double-stranded DNA consisting of said single strands, containing all or a portion of a single-stranded
under SEQ ID NO 1-7837 [...] of the base sequences listed
[...] hybridizing specifically to a [...] DNA,

Fig. 7

Hybrid cells used for Southern hybridization

Chromosomal mapping of each GS by Southern blot technique